(54) Title: VIRULENCE GENES OF M. MARINUM AND M. TUBERCULOSIS

(57) Abstract: Methods for identifying, isolating and mutagenizing virulence genes of mycobacteria, e.g., M. marinum and M.
tuberculosis, are described. Also described are isolated virulence genes and fragments of them, isolated gene products and fragments
of them, avirulent bacteria in which one or more virulence genes are mutagenized, attenuated vaccines containing such mutant
bacteria, and methods to elicit an immune response in a host, using such mutant bacteria.
VIRULENCE GENES OF M. MARINUM and M. TUBERCULOSIS

Background of the Invention 

Mycobacteria are bacterial organisms which are implicated in diseases such as, e.g., 
tuberculosis. It would be desirable to provide means for treating or preventing conditions 
caused by such mycobacteria, e.g., by immunization. 

Description of the Invention 

This invention relates, e.g., to virulence genes of mycobacteria. The invention 
provides methods to identify and isolate virulence genes of, for example, Mycobacterium 
marinum, a fish bacterium, and Mycobacterium tuberculosis, the primary etiologic agent 
of human tuberculosis. The invention also provides methods to mutagenize such virulence 
genes, thereby allowing the generation and isolation of avirulent mycobacteria. The 
invention also relates to isolated virulence genes and variants and fragments thereof; to 
isolated virulence gene products and variants and fragments thereof; to mutant, avirulent, 
bacteria; to attenuated vaccines comprising the mutant bacteria; and to methods to elicit 
an immune response in a host, using such mutant bacteria. 

One embodiment of the invention is a method for identifying a virulence gene of 
M. marinum, comprising 

a) mutagenizing M. marinum bacteria by introducing into said bacteria a plasmid 
which comprises a tagged (e.g., signature-tagged) transposon, whereby the transposon
integrates into and disrupts a gene in the bacteria, 

b) introducing said mutagenized bacteria into a host susceptible to infection thereof 
(e.g., a goldfish), 

c) identifying a mutagenized bacterium which comprises a tagged transposon and 
which exhibits reduced viability in the host, compared to other mutagenized (or
non-mutagenized) M. marinum bacteria,

d) cloning and/or sequencing (characterizing) a nucleic acid sequence which flanks 
the integrated transposon in said identified mutagenized bacterium, and 

e) identifying a wild type M. marinum gene which comprises at least a portion of 
said flanking sequence. 

Of course, the above method can be carried out using one or more of the steps, in 
any order, effective to achieve the intended purpose. 



WO 01/19993

PCT/US00/25512

Another embodiment is a method for identifying a virulence gene of M.
tuberculosis, comprising identifying an M. marinum virulence gene as described above,
and further comprising,

comparing said flanking nucleic acid sequence to a databank of M. tuberculosis
nucleic acid sequences, and/or comparing the sequences of peptides which are coded for
by said flanking sequences to a known M. tuberculosis protein database, and

identifying an M. tuberculosis gene which comprises a sequence that is
substantially identical to said flanking sequences and/or polypeptides encoded by them.
In other embodiments, the degree of identity can be less than substantially identical, e.g.,
about 35-50%, or about 50-70%, or about 70-90%.
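
As a hedged illustration of how a degree of identity might be assigned to one of the ranges above, the following sketch computes percent identity over two sequences that have already been aligned. The function names, the use of "-" as a gap character, and the choice to divide identical columns by all alignment columns are assumptions made for illustration; the text does not prescribe a particular formula.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns holding the same residue in both sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be pre-aligned to the same length")
    # Assumption: columns that are gaps in both sequences carry no information.
    columns = [(x, y) for x, y in zip(aligned_a, aligned_b)
               if not (x == "-" and y == "-")]
    if not columns:
        return 0.0
    identical = sum(1 for x, y in columns if x == y and x != "-")
    return 100.0 * identical / len(columns)


def identity_band(pct: float) -> str:
    """Map a percent identity onto the approximate ranges named in the text."""
    if pct >= 90:
        return "above 90%"
    if pct >= 70:
        return "about 70-90%"
    if pct >= 50:
        return "about 50-70%"
    if pct >= 35:
        return "about 35-50%"
    return "below 35%"
```

For example, `identity_band(percent_identity("ACGT", "ACGA"))` places a three-of-four match in the "about 70-90%" range.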



Another embodiment is a method for isolating a mutagenized M. marinum 
bacterium which exhibits reduced virulence in a host susceptible to infection thereof 
compared to a non-mutagenized M. marinum bacterium, comprising integrating a tagged 
(e.g., signature-tagged) transposon into the DNA of an M. marinum bacterium in a manner
effective to produce reduced virulence, and isolating said mutagenized bacterium.

Another embodiment is an avirulent M. marinum bacterium in which one or more 
genes comprising a nucleic acid of SEQ ID NOs: 4, 6, 8, 10, 11, 13, 17, 21, 23, 25, 27, 29,
31, 35, 39, 41, 43 or 44 are mutated. Another embodiment is a pharmaceutical
composition or an attenuated vaccine comprising such an avirulent M. marinum bacterium
and a pharmaceutically acceptable carrier.

Another embodiment is an avirulent M. tuberculosis bacterium in which one or 
more virulence genes identified as described above are mutated. Another embodiment is 
an avirulent M. tuberculosis bacterium in which one or more of the genes encoding 
proteins Rv0822c, CY20G9.23 (Rv0497), the pks family, including e.g., ppsE (Rv2935), 
pks6 (Rv0405), pks9 (Rv1664), pks8 (Rv1662), pks1 (Rv2946c), and pks002c, Rv3511,
008381 (Rv0357c), Rv3775, Rv3137, Rv2348c, Rv3860, mbtB (Rv2383c), Rv2181,
Rv1954c, Rv0987, Rv3268, Rv2610c, nrp (pir E70751, Rv0101), mbtE (Rv2380c),
Rv0236c or smc (Rv2922c) are mutated. Another embodiment is a pharmaceutical 
composition or an attenuated vaccine comprising one or more of the above avirulent M.
tuberculosis bacteria (e.g., an M. tuberculosis strain constructed with one or more
mutations in one or more of the above virulence genes) and a pharmaceutically acceptable 
carrier. 

Another embodiment is an isolated nucleic acid of M. marinum comprising an
oligonucleotide of SEQ ID NOs: 4, 6, 8, 10, 11, 13, 17, 21, 23, 25, 27, 29, 31, 35, 39, 41,
43 or 44, or a variant or fragment thereof. Another embodiment is a nucleic acid which
is complementary to at least a portion of said isolated M. marinum nucleic acid, or which
can hybridize to at least a portion of said isolated M. marinum nucleic acid under selected
(e.g., high) stringency conditions. In other embodiments, the isolated M. marinum nucleic
acid is a gene; or the isolated M. marinum nucleic acid or fragments thereof are cloned
into, and/or expressed in, an expression vector.

Another embodiment is an isolated nucleic acid of M. tuberculosis, comprising a
virulence gene identified as above, or a variant or fragment thereof. Another embodiment
is a nucleic acid which is complementary to at least a portion of said isolated M.
tuberculosis nucleic acid, or which can hybridize to at least a portion of said isolated M.
tuberculosis nucleic acid under selected (e.g., high) stringency conditions. In other
embodiments, the isolated M. tuberculosis nucleic acid or fragments thereof are cloned
into, and/or expressed in, an expression vector.

Another embodiment is a method to elicit an immune response in a fish,
comprising introducing into the fish an avirulent M. marinum bacterium made (e.g.,
isolated, constructed) as described above. Another embodiment is a method to elicit an
immune response in a human or non-human animal (e.g., domestic or farm animal, such
as a cow) host, comprising introducing into said host an avirulent M. tuberculosis
bacterium, in which one or more virulence genes identified as described above are
mutated. Another embodiment is a method to elicit an immune response in a human host,
comprising introducing into such host an avirulent M. tuberculosis bacterium in which one
or more of the genes encoding proteins Rv0822c, CY20G9.23, the pks family of proteins,
Rv3511, 008381, Rv3775, Rv3137, Rv2348c, Rv3860, mbtB, Rv2181, Rv1954c, Rv0987,
Rv3268, Rv2610c, nrp (pir E70751), mbtE, Rv0236c or smc is mutated.

A wide variety of Mycobacteria species can be used in the invention. In a most
preferred embodiment, the bacterium is Mycobacterium marinum (M. marinum), which
causes fish tuberculosis, as well as, in humans, skin infection or localized nodular and
ulcerated lesions (mariner's tuberculosis) on the extremities and, in immunocompromised
patients, systemic disease; Mycobacterium tuberculosis (M. tuberculosis), the primary
etiologic agent for tuberculosis (TB) in man; or Mycobacterium bovis (M. bovis), which
causes human or bovine tuberculosis. Other species of Mycobacterium which can be used
in the invention include, e.g., M. bovis BCG, M. africanum, M. leprae, M. microti, M.
smegmatis, M. vaccae, M. ulcerans, M. haemophilum, M. fortuitum, M. chelonae, and
others.

The term "virulent" in the context of mycobacteria refers to a bacterium or strain 
of bacteria that replicates within a host cell or animal within the mycobacterium host range 
at a rate which is detrimental to the cell or animal, or that induces a host response which 
is detrimental. More particularly, virulent mycobacteria persist longer in a host than 
avirulent bacteria. Virulent mycobacteria are typically disease producing; and infection
leads to various disease states including fulminant disease in the lung, disseminated
systemic miliary tuberculosis, tuberculosis meningitis, and/or tuberculosis abscesses of
various tissues. Infection by virulent mycobacteria often results in death of the host 
organism. 

By contrast, the term "avirulent," as used herein, refers to a bacterium or strain of 
bacteria that does not replicate within a host cell or animal within its host range; replicates 
at a rate which is not significantly detrimental to the cell or animal; and/or does not induce 
a detrimental host response. An avirulent (e.g., attenuated, non-pathogenic) strain is
incapable of inducing a full suite of symptoms of the disease that is normally associated 
with its virulent pathogenic counterpart. Avirulent bacteria exhibit a reduced ability, or 
an inability, to survive in a host, but not all bacteria which exhibit such an impaired ability 
to survive in a host are avirulent. For example, in a simultaneous in vivo test of several 
mutant bacteria, certain mutants which are unable to compete with other mutants may not, 
when tested in the presence of the other strains, replicate efficiently or survive in the host; 
however, such bacteria, when tested individually, may prove to be virulent. An avirulent 
bacterium can contain one or more mutations in one or more virulence genes. 

A "virulence gene" encodes a gene product ("virulence factor," "virulence
determinant") which contributes, directly or indirectly, to infection (e.g., attachment,
invasion, transport into the cell, replication, etc.) and/or to tissue destruction and/or
disease. A virulence gene can code for or modify, e.g., an adhesion molecule or other
molecule which aids in the attachment to or invasion of a host cell; a toxin (e.g., a secreted
factor which can cause lysis or damage of a host cell - for example, a small molecule such
as a polyketide, or an enzyme such as a phospholipase, lipase, esterase or protease); a
factor required for efficient secretion of such a toxin; a factor involved in intracellular
multiplication or growth; a factor involved in resistance to host defenses; a factor which
can stimulate a host cell to produce an inflammatory product or cytokine that can amplify
tissue damage in a host; or a factor which regulates the production and/or activity of a
virulence factor. Also included are certain functions which resemble "housekeeping"
functions, e.g., functions which allow bacteria to provide nutrients that are limiting in a
host, such as factors which aid in the acquisition of iron, or certain enzymes of purine or
pyrimidine biosynthesis. For a review of some of the putative or suspected virulence
determinants of Mycobacterium tuberculosis, see Quinn et al (1996). Curr. Top.
Microbiol. Immunol. 215, 131-156.

By a "host" for a bacterium is meant an organism, or a cell or tissue of an
organism, which can be infected by the bacterium and which exhibits consequences of that
infection. For example, Mycobacterium marinum can infect and cause symptoms in the
frog (Rana pipiens) or in any of about 150 fresh-water or salt-water species of fish. In an
especially preferred embodiment, the host for Mycobacterium marinum is the goldfish,
Carassius auratus. Well-established animal models for M. tuberculosis include, e.g.,
guinea pig, mouse, rabbit and monkey; and many natural hosts exist for that bacterium,
including large animals such as the elephant. Many other bacteria/host combinations are
possible. See, e.g., B. Bloom, ed., (1994). Tuberculosis: Pathogenesis, Protection, and
Control, ASM Press, Washington, D.C., Chapter 11, for a discussion of tuberculosis in wild
and domestic animals.

A system in which goldfish are infected by M. marinum (the "goldfish model")
offers a number of advantages for experimental studies. For example, M. marinum has a
generation time of only 4 hours (as compared, e.g., to the greater than 20 hour generation
time of M. tuberculosis), and studies with M. marinum can be carried out in a Biosafety
Level 2 facility (whereas a Biosafety Level 3 facility is required, e.g., for studies with M.
tuberculosis). M. marinum can serve as an appropriate surrogate model for the study of
M. tuberculosis. M. marinum and the M. tuberculosis complex have been shown to be
closely related by, e.g., DNA hybridization and 16S rRNA gene sequence analysis (see,
e.g., Tonjum et al (1998). J. of Clinical Microbiology 36, 918-925). The disease
progression and symptoms of fish infected with M. marinum mimic those of humans
infected with M. tuberculosis: in both types of hosts, organs in all parts of the body can be
infected; both bacteria replicate within macrophages and reside in an endosomal
compartment which is nonacidic and does not fuse with the lysosomal compartment; and
both bacteria readily kill macrophages.

Examples 1B and 1C show, e.g., that the pathology in the goldfish model parallels
that of human tuberculosis. Depending on the dose of M. marinum organisms which is
inoculated into a fish, acute or chronic disease is elicited. The pathology of the acute
disease includes severe peritonitis and necrosis with all animals dying within 17 days of
infection. The pathology of the chronic disease includes progressive granuloma formation.
Granulomas with different histopathological features (necrotizing, non-necrotizing and
caseous) are seen in the experimentally infected goldfish, which is consistent with the
granuloma types seen in naturally infected animals and parallels the types of granulomas
found in human tuberculosis. Isolation of M. marinum from fish tissue is possible
throughout the course of the experiment presented in Example 1 (up to 16 weeks)
indicating, as in human tuberculosis, the persistence of the organisms in the host. Example
2 shows that the goldfish model can be used to distinguish virulent and avirulent forms of
M. marinum. Further disclosure of how to make the goldfish model, and how to use it,
e.g., to characterize molecular pathogenesis, can be found, e.g., in Talaat A.M. et al
(1998). Infection and Immunity 66, 2938-2942.

As an initial step in isolating virulence mutants, bacteria, e.g., M. marinum, can be
mutated by any of a variety of routine procedures which are well-known in the art, e.g.,
exposure to chemical agents, irradiation, genetic engineering, transposon mutagenesis, or
the like. As used in this application, the term "mutation" means any change (in
comparison with the appropriate parental strain) in the DNA sequence of an organism, e.g.,
a single (or multiple) base change, insertion, deletion, inversion, translocation, duplication,
or the like. A mutation can be polar or non-polar, a frameshift or in phase. Preferably, in
particular when a mutated bacterium is used as part of a treatment regimen or a vaccine,
the mutation is substantially incapable of reverting to the wild type.

In a most preferred embodiment, mutagenesis is carried out by a transposon
mutagenesis system that carries sequence-specific tags, sometimes known as
signature-tagged mutagenesis (STM). The unique tag sequence allows differentiation of individual
mutants among an inoculum pool of mutants. The STM protocol permits the screening of
a large number of mutants using a small number of animals. This method was developed
by Hensel et al (Hensel et al (1995). Science 269, 400-403; U.S. Pat. No. 5,876,931 to
Holden). Variations of the method and procedures for using it to isolate bacterial virulence
mutants are also disclosed in, e.g., Shea et al (1996). Proc. Natl. Acad. Sci. 93, 2593-2597;
Mei et al (1997). Mol. Microbiol. 26, 399-407; Schwan et al (1998). Infect. Immun.
66, 567-572; and Chiang et al (1998). Mol. Microbiol. 27, 797-805. Example 3 shows the
use of the STM system for the mutagenesis of M. marinum.

Any of a variety of methods can be used to generate a bank of plasmids carrying
unique signature-tagged transposons. A most preferred embodiment is shown in Example
3A. Here, 96 independent, non-cross-hybridizing, signature-tagged transposons, each of
which is hybridization- and amplification-efficient, are cloned into a mycobacteria suicide
vector which carries a selectable marker. Many variants of such vectors, carrying any of
a variety of selectable markers, can be used, of course. In Example 3A, the marker is a
kanamycin-resistance gene.

To generate a mutant mycobacterium library, plasmids from a master plasmid
collection are introduced individually (e.g., separately) into mycobacteria, preferably M.
marinum, by any of a variety of routine, art-recognized techniques (e.g., phage
transduction, use of a "gene gun," electroporation, or other conventional techniques).
In a most preferred embodiment, as shown in Example 3C, plasmids are introduced into
M. marinum by electroporation. Any desired number of transformed bacteria can be
selected from each transformation. In Example 3C, ninety-six transformations are
performed, one with each of the 96 master plasmids; and ten independent transformants
are selected from each transformation, to yield a library of 960 transformants. As Example
3B shows, the transposons integrate randomly into the M. marinum chromosome. In the
ideal circumstance, each integrated transposon disrupts a different gene, or a different
portion thereof, to create a library of, in this example, 960 differently mutagenized
bacteria.
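
The library arithmetic described above can be sketched as follows. Only the counts (96 master plasmids, ten transformants each, 960 mutants) come from the text; the integer identifiers and the grouping of mutants into pools that each contain one mutant per tag are assumptions made for illustration.

```python
from itertools import product


def build_library(n_plasmids: int = 96, transformants_each: int = 10) -> list[tuple[int, int]]:
    """Return one (tag, clone) pair per transformant selected for the library."""
    return list(product(range(1, n_plasmids + 1), range(1, transformants_each + 1)))


def pools_with_unique_tags(library: list[tuple[int, int]]) -> dict[int, list[tuple[int, int]]]:
    """Group mutants by clone number so that no tag appears twice in a pool (assumption)."""
    pools: dict[int, list[tuple[int, int]]] = {}
    for tag, clone in library:
        pools.setdefault(clone, []).append((tag, clone))
    return pools
```

Under these assumptions the 960 mutants fall into ten pools of 96, and every member of a pool carries a different tag, which is what allows each mutant in a pool to be detected independently.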

Pools of mutagenized bacteria, each of which can be detected independently by
virtue of its unique signature tag, are introduced into an appropriate host, e.g., a goldfish
(an "input pool"). Bacteria may be introduced into an animal by any route, e.g., orally,
intraperitoneally, intravenously or intranasally; for fish, the preferred routes of
administration are oral or, most preferably, intraperitoneal. It may be useful to compare,
e.g., virulence genes identified by oral administration to those identified by intraperitoneal
administration, as some genes may be required to establish infection by one route but not
by the other. Bacteria are left in the host for a suitable length of time, which is a function
of both the microorganism and the host. A method for optimization of some of the
infection parameters for the M. marinum/goldfish system is shown, e.g., in Examples 1 and
2.

Assays are performed to determine whether the bacteria are able to survive in the
host during the period of infection. Any of a variety of such assays can be used, e.g.,
subtractive hybridization, differential display, or the like. In a most preferred embodiment,
as shown in Example 4A, after an optimized period of infection by a pool of M. marinum
mutants, fish are sacrificed and one or more internal organs, e.g., spleen, liver, kidney,
peritoneum, heart, pancreas, or other organs evident to one of skill in the art, are cultured
to isolate the mutant bacteria which were able to survive in the fish, defined as the output
pool. A hybridization protocol to identify mutants present in the input and output pools
is described in Example 4A. Mutants which are present in the input pool, but which
cannot be detected in the output pool after a predetermined time of infection has elapsed,
are candidates for avirulent mutants, i.e., mutants which are unable to infect, replicate
and/or cause damage, in a particular cell type or tissue.
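
The input/output comparison can be illustrated with a minimal sketch. It assumes that each tag has been given a numeric hybridization signal in both pools and that a single hypothetical `detection_threshold` separates "detected" from "not detected"; neither the signal scale nor the threshold comes from the text.

```python
def candidate_attenuated_tags(input_signal: dict[str, float],
                              output_signal: dict[str, float],
                              detection_threshold: float = 1.0) -> list[str]:
    """Return tags detected in the input pool but not in the output pool."""
    candidates = []
    for tag, signal_in in input_signal.items():
        detected_in = signal_in >= detection_threshold
        # A tag missing from the output measurements is treated as undetected.
        detected_out = output_signal.get(tag, 0.0) >= detection_threshold
        if detected_in and not detected_out:
            candidates.append(tag)
    return sorted(candidates)
```

Tags that were never detected in the input pool are excluded, since their absence from the output pool says nothing about survival in the host.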

In order to confirm that an M. marinum mutant is avirulent, each putative virulence
mutant can be re-examined individually, e.g., in the goldfish model. In a preferred
embodiment, the median survival time (MST) of goldfish infected with a lethal dose (about
5 x 10^8 cfu) of a putative virulence mutant can be determined, and those mutants which
allow goldfish to survive longer than fish inoculated with an equivalent dose of wild type
organisms are categorized as putative virulence mutants. Many other types of screening
assays can be used, including Competitive Indices, histopathology examinations of one or
more of the organs described above, colony counts in organ homogenates, and analysis of
the ability of a mutant to induce granuloma formation. Representative protocols for each
of these methods are described, e.g., in Example 4B. In addition to confirming the
existence of a virulence mutant, data collected on each mutant can yield clues to the
pathogenesis pathways of M. marinum in the goldfish model. Methods to show that
Koch's postulates have been fulfilled (proving that a postulated virulence gene is
responsible for disease symptoms) are routine; one such method is presented in Example
8.
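
A minimal sketch of the median survival time comparison follows. It assumes that a time of death in days is recorded for every fish in each group; fish that survive to the end of the experiment (censored observations) are not handled, and the function names are hypothetical.

```python
from statistics import median


def median_survival_time(days_to_death: list[float]) -> float:
    """Median survival time (MST) of one group of infected fish, in days."""
    return median(days_to_death)


def prolongs_survival(mutant_days: list[float], wild_type_days: list[float]) -> bool:
    """True if fish given the mutant have a longer MST than fish given wild type."""
    return median_survival_time(mutant_days) > median_survival_time(wild_type_days)
```

A mutant for which `prolongs_survival` returns `True` would be carried forward as a putative virulence mutant; a real analysis would also test whether the difference is statistically significant.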

Alternative approaches to the STM technique can be used to identify avirulent M.
marinum mutants. For example, one can screen a library of M. marinum cosmids in M.
smegmatis. In the goldfish model, M. smegmatis does not persist in tissue when
inoculated at a dose of 10^7 organisms/fish. This is in contrast to M. marinum, which can
be isolated from fish tissue throughout the course of a 56 day experiment. In this
alternative approach, one can inject the fish with pools of the M. marinum cosmids in M.
smegmatis and look for those which survive in the animal. A library of M. marinum
cosmids in M. smegmatis can be obtained routinely, using standard, art-recognized
procedures.

Once an insertionally mutated M. marinum bacterium has been identified as being
a (putative) virulence mutant, a wild type M. marinum can be engineered to contain a more
well-defined (e.g., non-polar) mutation. The introduction of such a well-defined mutation
into a new genetic background can confirm that the original phenotype was the result of
the transposition event, rather than a secondary mutation. Furthermore, a well-defined
mutation can be used to ascertain the presence, if any, of polarity effects. For example,
the insertion of a transposon into a gene which is part of an operon can have polar effects
on downstream genes in the operon. One method to determine if a given defect results
from inactivation of the gene into which a transposon integrated, or if the actual virulence
gene(s) lies downstream of the integration site, is to generate a small, in-frame, non-polar,
deletion or insertion into a wild type correlate of the gene into which the transposon had
integrated. If such a mutant, when tested, for example as described above in the fish
model, does not exhibit an avirulent phenotype, other genes in the operon can be mutated
and analyzed in the same manner until one (or more) virulence genes are identified. That
is, nucleic acid sequences which flank the integrated transposon can be cloned and
sequenced in several sequential steps (e.g., one can "walk" down an operon) until a
virulence gene is identified. Of course, the invention includes genes which lie downstream
of a gene in which a polar mutation results in an avirulent phenotype. Such genes can be
considered to be "genes of the invention" or "genes identified by methods of the
invention."

As a first step in performing site-specific mutagenesis of a gene of interest, it is
preferable to isolate (e.g., clone) at least a portion of the corresponding wild type gene.
If the gene is part of an operon, some, if not all, of the other genes in the operon can also
be isolated. As used in this application, the term "isolated" (referring, e.g., to a gene or
gene product, nucleic acid, protein, bacterium, etc.) means being in a
non-naturally-occurring form. Methods to clone genes, particularly those containing a unique marker,
are routine for one of ordinary skill in the art. (See, e.g., Sambrook, J. et al (1989).
Molecular Cloning, a Laboratory Manual. Cold Spring Harbor Laboratory Press, Cold
Spring Harbor, NY; Ausubel, F.M. et al (1995). Current Protocols in Molecular Biology,
N.Y., John Wiley & Sons; Davis et al. (1986), Basic Methods in Molecular Biology,
Elsevier Science Publishing, Inc., New York; Hames et al. (1985), Nucleic Acid
Hybridization, IRL Press; Dracopoli, N.C. et al, Current Protocols in Human Genetics,
John Wiley & Sons, Inc.; and Coligan, J.E. et al., Current Protocols in Protein Science,
John Wiley & Sons, Inc., for many of the molecular biology techniques referred to in this
application, including isolating, cloning, modifying, labeling, manipulating, sequencing,
and otherwise treating or analyzing nucleic acid and/or protein.) In one method, clones
comprising a gene(s) of interest can readily be identified and isolated from a wild type
library (e.g., a cosmid library, Bacterial Artificial Chromosome (BAC) library (Brosch,
R. et al (1998). Infect. Immun. 66, 2221-2229; Philipp, W.J. et al (1996). PNAS 93, 3132-37),
phage library, cDNA library, or the like), using conventional, routine, procedures in
the art. Methods for subcloning a gene(s) of interest are also routine for one of ordinary
skill in the art.

Example 6 describes a preferred embodiment of the invention, in which a
hybridization probe corresponding to gene sequences flanking the site of transposon
integration in an M. marinum mutant is used to screen a cosmid library of wild type M.
marinum genes. Because many M. marinum genes are about 2 kb in size, and the average
DNA insert in a cosmid library can be about 30-40 kb, it is likely that a cosmid clone so
identified will contain the entire operon, if any, in which the gene of interest is located.
It is understood, of course, that the genes and clones referred to in this application
typically are double-stranded; therefore, a probe "corresponding to" a given sequence can
be designed to hybridize to either of the strands of the DNA duplex, or to a nucleic acid
(e.g., RNA or cDNA) which is complementary to one strand of the duplex.

The term "a cloned gene," as used herein, can encompass not only the regions of 
DNA that code for a polypeptide but also regulatory regions of DNA such as regions of 
DNA that regulate transcription, translation and, for some microorganisms, splicing of 
RNA. Thus, a "gene" can include promoters, transcription terminators, ribosome-binding 
sequences and, for some organisms, introns and splice recognition sites. A cloned "gene"
as used herein can be, e.g., a genomic or a cDNA gene, or a rRNA or tRNA gene, or the 
like. 

After a gene of interest, or a portion thereof, has been cloned, defined mutation(s)
can be introduced into it, using methods of site-specific mutagenesis which are
well-known in the art. Any type of mutation, for example those defined above, can be
introduced into a cloned gene of interest. In a preferred embodiment, a wild type, cloned
M. marinum virulence gene is mutated such that an insertion or deletion (ranging from
about 3 bases to about 90% of the entire gene sequence, preferably about 99 to about 4000
bases, most preferably about 500 bases) is introduced in such a way that the coding
sequences remain in phase (i.e., the insertion or deletion is a multiple of 3 bases). In a
most preferred embodiment, the mutation is an insertion of a nucleic acid fragment which
comprises a kanamycin resistance marker. The site of the mutation can be chosen at will,
but it is preferably in the 5'-terminal half of the gene. The availability of convenient
restriction sites in the gene can simplify the introduction of mutations.
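
The in-phase requirement above reduces to a divisibility check, sketched here with hypothetical helper names. The only rule taken from the text is that the insertion or deletion must be a multiple of 3 bases.

```python
def is_in_frame(length_change_bases: int) -> bool:
    """True if an insertion or deletion of this size keeps the coding sequence in phase."""
    return length_change_bases % 3 == 0


def nearest_in_frame_length(target_bases: int) -> int:
    """Round a target insertion/deletion size to the nearest multiple of 3."""
    return 3 * ((target_bases + 1) // 3)
```

Note that a size of exactly 500 bases is not a multiple of 3, so "about 500 bases" would in practice be realized as a nearby in-frame size such as 498 or 501; `nearest_in_frame_length(500)` gives 501.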

The mutated DNA can be reintroduced into the M. marinum genome by any of a
variety of well-characterized methods. In a most preferred embodiment, the mutation is
introduced into the genome by allelic exchange (homologous recombination). Methods
for using long linear recombination substrates for allelic exchange in Mycobacteria are
provided, e.g., in Balasubramanian, V. et al (1996). J. Bacteriol. 178, 273-279. Other
methods for homologous recombination are found, e.g., in Aldovini, A.R. et al (1993). J.
Bacteriol. 175, 7282-7289; Norman, E. et al (1995). Mol. Microbiol. 16, 755-760;
Baulard, A. et al (1996). J. Bacteriol. 178, 3091-3098; Marklund, B.I. et al (1995). J.
Bacteriol. 177, 6100-6105; Ramakrishnan, L. et al (1997). J. Bacteriol. 179, 5862-5868;
and U.S. Pat. No. 5,700,683.

Simultaneously with the characterization of a virulence defect in an M. marinum
mutant, or prior or subsequent to such characterization, the gene which is disrupted by the
transposon insertion can be identified and characterized. In one embodiment, regions
flanking one or both sides of an integrated transposon are characterized by hybridization
to a panel of selected sequences. In a most preferred embodiment, the flanking regions are
sequenced in order to identify the gene which has been disrupted. Many sequencing
methods are, of course, well-known to those of ordinary skill in the art. Example 5
describes two methods to sequence directly the flanking regions, as well as methods to
first clone and then sequence such regions. In a most preferred embodiment, genomic
sequences flanking a transposon are amplified using a strategy called ligation-mediated
PCR (LMPCR) (Prod'hom et al (1998). FEMS Microbiology Letters 158, 75-81). Briefly,
this method uses one primer specific for the known sequence (IS (insertion sequence)
present on both ends of the transposon) and a second specific for a synthetic linker ligated
to restricted genomic DNA. This method is illustrated in Figures 11A and B. The size
of the flanking regions which can be analyzed is limited by factors such as the fragment
size that can be amplified by PCR, and can be readily determined by one of skill in the art.
In a most preferred embodiment, a flanking region is about 100 to about 1,000 bases long.
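
Once an LMPCR product has been sequenced, the flanking region can be recovered by trimming the known ends, as in the following sketch. It assumes that the read is oriented from the IS end outward and that the linker appears in the same orientation; the function name and all sequences used with it are hypothetical and are not the IS or linker sequences of the Examples.

```python
from typing import Optional


def extract_flank(amplicon: str, is_end: str, linker: str) -> Optional[str]:
    """Return the genomic sequence between the IS end and the ligated linker."""
    amplicon, is_end, linker = amplicon.upper(), is_end.upper(), linker.upper()
    start = amplicon.find(is_end)
    if start == -1:
        return None  # the read does not contain the transposon end
    start += len(is_end)
    end = amplicon.find(linker, start)
    if end == -1:
        end = len(amplicon)  # linker not reached; keep everything after the IS end
    return amplicon[start:end]
```

The returned string is the candidate flanking sequence that would then be compared against sequence databases.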

The comparison of sequences of previously uncharacterized virulence genes in M.
marinum to sequences in publicly available DNA and protein databases from a variety of
sources (e.g., GenBank, EMBL, DDBJ, SWISS-PROT, PRF, PDB, RefSeq, etc.) can aid
in the identification of (functional) homologues, and can add insight into the role a
virulence gene plays in the molecular pathogenesis pathways of mycobacteria in an animal
host.
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Optimal alignment of sequences may be conducted by the local homology 
algorithm of Smith and Waterman (1981). Adv. Appl. Math. 2_482; by the homology 
alignment algorithm of Needleman and Wunsch (1970). J. Mol. Biol 48, 443; by the 
search for similarity method of Pearson and Lipman (1988). Proc. Natl Acad. ScL 85, 
5 2444; or by computerized implementations of these algorithms (e.g., GAP, BESTFIT, 
FAST A, and TFASTA in the Wisconsin Genetics Software Package Release 7.O., Genetics 
Computer Group, 575 Science Dr. Madison, Wis.) Other such computer programs 
include, e.g., BLAST and FASTA (Altschul, S.F. et al (1990). J. Mol. Biol. 215, 403-410); 
BLASTX; TBLASTN; Gapped BLAST and PSI-BLAST (Altschul, S.F. et al (1997), 
10 Nucleic Acids Res. 25, 3389-3402). Alternatively, the sequences can be aligned by 
inspection. The best alignment (i.e., resulting in the highest percentage of sequence 
similarity over the comparison window) generated by the various methods is selected. Tn 
a most preferred embodiment, the BLAST blastx program is used. 
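
For illustration only, the local homology approach of Smith and Waterman can be sketched as a small dynamic-programming routine. The scoring values (match +2, mismatch -1, linear gap -2) are arbitrary assumptions chosen for the example; production tools such as BLAST use substitution matrices and heuristics that are not reproduced here.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Return (score, aligned_a, aligned_b) for one best local alignment."""
    rows, cols = len(a) + 1, len(b) + 1
    H = [[0] * cols for _ in range(rows)]   # scoring matrix, first row/column zero
    best, best_pos = 0, (0, 0)

    def sub(i, j):
        # Substitution score for aligning a[i-1] with b[j-1].
        return match if a[i - 1] == b[j - 1] else mismatch

    # Fill the matrix; local alignment never lets a cell drop below zero.
    for i in range(1, rows):
        for j in range(1, cols):
            H[i][j] = max(0,
                          H[i - 1][j - 1] + sub(i, j),
                          H[i - 1][j] + gap,
                          H[i][j - 1] + gap)
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Trace back from the best cell until a zero cell is reached.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        if H[i][j] == H[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif H[i][j] == H[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))


print(smith_waterman("TTTACGTTT", "GGACGGG"))
```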

Typically, a polynucleotide sequence of interest is translated into all six possible 
reading frames and is searched with the NCBI Blast search, selecting blastx. This 
translated sequence is first run against the EMBL data base to identify functional 
homologs. Then, if desired, the sequence is searched with the advanced Blast program, 
against Mycobacterium sequences in particular. In a preferred embodiment, sequences 
identified by such a homology alignment exhibit substantial identity to the sequence of 
interest. Of course, any selected degree of sequence identity can be the basis of such a 
comparison, e.g., about 30-50%, about 50-70% or about 70-90% sequence identity at the 
nucleotide or amino acid level. 
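
The translation of a polynucleotide into all six reading frames, which blastx performs internally, can be sketched as follows. This is a minimal illustration assuming the standard genetic code (whose codon assignments are shared by the bacterial code); the function names and the example sequence are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq):
    """Translate complete codons only; unknown codons become 'X'."""
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X")
                   for i in range(0, len(seq) - 2, 3))


def six_frame_translation(dna):
    """Return a dict mapping frame labels (+1..+3, -1..-3) to protein strings."""
    dna = dna.upper()
    frames = {}
    for strand, s in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(s[offset:])
    return frames


for label, protein in six_frame_translation("ATGGCCTAA").items():
    print(label, protein)
```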

The following terms are used to describe the sequence relationships between two 
or more polynucleotides or polypeptides: "reference sequence," "comparison window," 
"sequence identity," "percentage of sequence identity," and "substantial identity." 

A "reference sequence" is a defined sequence used as a basis for a sequence 
comparison; a reference sequence may be a subset of a larger sequence, for example, a 
segment of a full-length cDNA or gene sequence given in a sequence listing, or may 
comprise a complete cDNA or gene sequence. Generally, a reference is at least about 10 
nucleotides in length, frequently at least about 20 to 25 nucleotides in length, and often at 
least about 50 nucleotides in length. In a preferred embodiment, a reference sequence is 
at least about 100 nucleotides in length, frequently at least about 150-300 nucleotides in 
length. Sequence comparisons between two (or more) polynucleotides are typically 
performed by comparing sequences of the two polynucleotides over a "comparison 
window" to identify and compare local regions of sequence similarity. A "comparison 
window," as used herein, refers to a segment of at least about 10 contiguous nucleotide 
positions wherein a polynucleotide sequence may be compared to a reference sequence of 
at least about 10 contiguous nucleotides and wherein the portion of the polynucleotide 
sequence in the comparison window may comprise additions and deletions (i.e., gaps) of 
about 20 percent or less as compared to the reference sequence (which does not comprise 
additions or deletions) for optimal alignment of the two sequences. 

The term "sequence identity" means that two polynucleotide or polypeptide 
sequences are identical (e.g., on a nucleotide-by-nucleotide or amino acid-by-amino acid 
basis) over the window of comparison. The term "percentage of sequence identity" is 
calculated by comparing two optimally aligned sequences over the window of comparison, 
determining the number of positions at which the identical nucleic acid base (e.g., A, T, 
C, G, U, or I) or amino acid residue occurs in both sequences to yield the number of 
matched positions, dividing the number of matched positions by the total number of 
positions in the window of comparison (i.e., the window size), and multiplying the result 
by 100 to yield the percentage of sequence identity. The term "identical" in the context of 
two nucleic acid or polypeptide sequences refers to the residues in the two sequences 
which are the same when aligned for maximum correspondence. 
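
The calculation just described can be written out as a short sketch. It assumes the two sequences have already been optimally aligned, that gaps are written as "-", and that the window size is the full length of the alignment passed in; these conventions and the function name are assumptions made for the example.

```python
def percent_identity(aligned_a, aligned_b):
    """Percentage of sequence identity over an aligned comparison window.

    Matched positions are divided by the total number of positions in the
    window (gap columns count as positions but never as matches).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("comparison window is empty")
    matches = sum(1 for x, y in zip(aligned_a, aligned_b)
                  if x == y and x != "-")
    return 100.0 * matches / len(aligned_a)


print(percent_identity("ACGTACGT", "ACGTTCGT"))  # 7 of 8 positions match
print(percent_identity("AC-GT", "ACTGT"))        # 4 of 5 positions match
```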

The term "substantial identity" or "substantial similarity" indicates that a nucleic 
acid or polypeptide comprises a sequence that has at least about 90% sequence identity to 
a reference sequence, or preferably at least about 95%, or more preferably at least about 
98% sequence identity to the reference sequence, over a comparison window of at least 
about 10 to about 100 or more nucleotides or amino acid residues. An indication that two 
polypeptide sequences are substantially identical is that one protein is immunologically 
reactive with antibodies raised against the second protein. An indication that two nucleic 
acid sequences are substantially identical is that the polypeptide which the first nucleic 
acid encodes is immunologically cross-reactive with the polypeptide encoded by the 
second nucleic acid. 

Another indication that two nucleic acid sequences are substantially identical is 
that the two molecules hybridize to each other under selected high stringency conditions. 
High stringency conditions are sequence-dependent and will be different with different 
environmental parameters. Generally, high stringency conditions are selected to be about 
5°C to 20°C lower than the thermal melting point (Tm) for the specific sequence at a 
defined ionic strength and pH. The Tm is the temperature (under defined ionic strength and 
pH) at which 50% of the target sequence hybridizes to a perfectly matched probe. 
Typically, high stringency conditions will be those in which the salt concentration is at least 
about 0.2 molar at pH 7 and the temperature is at least about 60°C. 

Analyses of the peptides or proteins which can be translated from flanking DNA 
sequences can be particularly informative for identifying functional homologues. The 
similarity between two polypeptides is determined by comparing the amino acid sequence 
and its conserved amino acid substitutes of one polypeptide to the sequence of a second 
polypeptide. Alignment procedures such as those discussed above can be used. 

The sequencing and characterization of regions flanking thirteen transposons which 
have independently integrated into M. marinum, rendering the bacteria avirulent in the 
goldfish model, are shown in Example 9. At least six of the M. marinum mutant genes are 
closely related to a previously identified functional homologue(s) from another organism, 
e.g., a transcriptional regulator from Streptomyces coelicolor which belongs to the AraC 
family of transcriptional regulators; an integral membrane protein; polyketide synthase 
genes from Streptomyces and Pseudomonas bacteria; a sulfate adenylyltransferase with 
homology to diverse organisms including Pyrococcus abyssi, Synechocystis sp., and 
Bacillus subtilis; a cysQ gene, or dhbF from B. subtilis. The possible significance of these 
functional properties for M. marinum virulence is discussed in Example 9. 

The flanking sequences in M. marinum can also be compared in a similar manner 
to databanks of mycobacteria sequences, using the Advanced Blast search from NCBI and 
selecting Mycobacterium as the genome, and/or the complete sequence of M. tuberculosis 
(Cole, S.T. et al. (1998). Nature 393, 537-558), in order to identify virulence genes in other 
mycobacteria. In a most preferred embodiment, this method can be used to identify 
virulence genes of M. tuberculosis. For example, Example 9 shows that the thirteen M. 
marinum virulence genes examined have functional homologues in M. tuberculosis. 
Methods to clone such M. tuberculosis homologues are routine in the art. See, e.g., 
Example 7. 

Defined mutations can be introduced into cloned, putative virulence genes of M. 
tuberculosis by methods similar to those discussed above for mutagenizing cloned 
M. marinum genes. The mutations can be made in M. tuberculosis either before or after 
the corresponding mutations in M. marinum have been characterized. Any of the types of 
mutations described above can be introduced into an M. tuberculosis gene, including 
knockouts of a large portion, including the entire coding sequence, of the gene. In order 
to facilitate the generation of mutants in M. tuberculosis, conventional, routine procedures 
can be used to identify those regions of the M. tuberculosis gene which correspond to the 
site of mutation in the corresponding M. marinum gene. For example, corresponding 
active sites and/or functional domains can be identified by, e.g., comparing the sequences 
or modeling the predicted protein structures. The mutated DNA can then be reintroduced 
into the M. tuberculosis genome by methods similar to those described above for 
reintroducing mutations into the M. marinum genome. Several such methods are described 
in Example 7. In a most preferred embodiment, the defined mutation is reintroduced into 
the M. tuberculosis genome by homologous recombination using a long linear 
recombination substrate. The phenotypic effect of an M. tuberculosis mutation can be 
determined routinely with one of several available animal models for this organism, 
including, e.g., the infection models with guinea pig (Collins, D.M. et al. (1995). PNAS 
92, 8036-8040; B. Bloom, ed., (1994). Tuberculosis: Pathogenesis, Protection, and 
Control, ASM Press, Washington, D.C. Chapter 9); mouse and rabbit (B. Bloom, ed., ibid., 
Chapters 8 and 10, respectively); and monkey (Walsh et al. (1996). Nature Medicine 2, 
430-436). 

The invention encompasses virulence genes (e.g., isolated virulence genes) as 
described elsewhere herein, from M. marinum and/or M. tuberculosis, which are identified 
by the methods of the invention, and/or variants (e.g., naturally- or non-naturally-occurring 
modifications, mutations, polymorphisms, etc.) or fragments thereof. By a "variant" of 
a gene or fragment is meant, as used herein, a replacement, deletion, insertion or other 
modification of the gene or fragment. It is preferred that the variant has at least about 70% 
sequence identity, more preferably at least about 85% sequence identity, most preferably 
at least about 95% or 98% sequence identity with the gene or fragment. The degree of 
similarity can be determined using any of the methods disclosed herein. By a "fragment" 
of a gene is meant a single-stranded or double-stranded nucleic acid (e.g., oligonucleotide) 
of a size smaller than that of the gene, obtained by any of a variety of conventional means, 
e.g., digestion with restriction enzymes, PCR amplification, synthesis with an 
oligonucleotide synthesizer, synthesis with a DNA or RNA polymerase, or the like. Such 
fragments can be used, for example, to diagnose the presence of a gene in a sample of 
interest, e.g., by serving as a hybridization probe or a PCR primer. Such diagnostic assays 
can be set up and performed by routine, conventional procedures in the art. In another 
embodiment, such fragments can be used to screen for virulent strains of bacteria, e.g., 
bacteria which comprise a polynucleotide that encodes a particular virulence gene or a 
fragment thereof. Of course, full-length virulence genes of the invention and variants 
thereof can also be used in diagnostic assays. 

The invention also encompasses polynucleotides which are complementary to a 
gene of the invention or fragment thereof, or which hybridize to such a gene or fragment 
under selected (e.g., high) stringency conditions. For example, the invention encompasses 
an oligonucleotide complementary to a portion of a virulence gene which can be used, e.g., 
as an antisense oligonucleotide to regulate expression of the gene, e.g., in a method of 
therapy. Methods to make and use antisense molecules of this type are conventional and 
routine, and are presented, e.g., in U.S. Pat. Nos. 5,876,931 and 5,585,479 and in 
references cited therein. Similarly, ribozymes comprising such fragments can be used in 
a method of treatment. Methods of making and using ribozymes are also conventional in 
the art. 

Of course, the genes and fragments discussed herein can be any form of 
polynucleotide or nucleic acid, e.g., naturally occurring, synthetic or intentionally 
manipulated polynucleotides, wherein nucleotide bases or modified bases are linked by 
various known linkages, e.g., ester, phosphodiester, sulfamate, sulfamide, 
phosphorothioate, phosphoramidate, methyl phosphonate, carbamate, or other bonds, 
depending on the desired purpose, e.g., resistance to nucleases, such as RNAse H, 
improved in vivo stability, etc. Various modifications can be made to nucleic acids, such 
as attaching detectable markers (e.g., avidin, biotin, radioactive or fluorescent elements, 
ligands), or moieties which improve hybridization, detection or stability. The 
polynucleotides can be DNA, cDNA, RNA, PNA, synthetic nucleic acid, modified nucleic 
acid, or mixtures thereof. Polynucleotides can be of any size, e.g., ranging from short 
oligonucleotides to large gene clusters or operons. Either or both strands of a 
double-stranded nucleic acid are included. 

The invention also encompasses peptides or polypeptides encoded by and/or 
expressed from M. marinum and/or M. tuberculosis genes identified by the methods of the 
invention, and/or variants or fragments thereof, and products which are generated by such 
peptides or polypeptides. The term "genes identified by the methods of the invention" 
encompasses any gene in a given operon, a mutation in one of whose genes results in an 
avirulent phenotype (e.g., the gene can be a downstream gene whose expression is 
diminished or abolished because of an upstream polar mutation, or a gene whose gene 
product interacts with another gene product of the operon, etc.). 

The peptides or polypeptides can be isolated (e.g., purified) from bacteria directly, 
or they can be expressed recombinantly and isolated (e.g., purified) from recombinant 
organisms. Methods of isolating, purifying and sequencing naturally produced or 
recombinantly produced peptides and polypeptides are conventional and routine in the art. 
The genes can be cloned into any of a variety of expression vectors. The sequences to be 
expressed can be genomic sequences, e.g., subcloned sequences from a cosmid library as 
described in Example 6, or they can be corresponding cDNA sequences, obtained by 
conventional means. In some cases, it may be desirable to express a fragment of a gene, 
or more than one gene, e.g., as many as the genes of an entire operon. Vectors and 
appropriate regulatory elements for expressing genes in a variety of cell types or hosts, 
including prokaryotes, yeast, and mammalian, insect and plant cells, and methods of 
cloning and expressing genes or gene fragments, are routine in the art and are discussed, 
e.g., in U.S. Pat. Nos. 5,876,931, 5,700,683, 4,440,859, 4,530,901, 4,582,800, 4,677,063, 
4,678,751, 4,704,362, 4,710,463, 4,757,006, 4,766,075 and 4,810,648. 

The invention also encompasses a host transformed to express a peptide or 
polypeptide of the invention, or a host which is mutated so the expression of a peptide or 
polypeptide of the invention is disrupted (e.g., inhibited), or progeny of such hosts. 

"Variants" of the peptides or polypeptides are also included in the invention, e.g., 
insertions, deletions and substitutions, either conservative or non-conservative, where such 
changes do not substantially alter the normal function of the protein. By "conservative 
substitutions" is meant combinations such as Gly, Ala; Val, Ile, Leu; Asp, Glu; Asn, 
Gln; Ser, Thr; Lys, Arg; and Phe, Tyr. Variants can include, e.g., homologs, muteins and 
mimetics. Many types of protein modifications, including post-translational 
modifications, are included. See, e.g., modifications disclosed in U.S. Pat. No. 5,935,835. 

"Fragments" of the peptides or polypeptides are also included in the invention. 
These fragments can be of any length. In a preferred embodiment, a fragment is functional 
(e.g., has biological activity, can inhibit or enhance the activity of a protein or other 
substance, contains one or more immunogenic epitopes, etc.). In a most preferred 
embodiment, the fragment contains all or a subset of the amino acids of SEQ ID NOs: 5, 
7, 9, 12, 15, 20, 22, 24, 26, 28, 30, 32-34, 36-38, 40 or 42. 

Among the polypeptides of particular interest are polyketide synthases. Example 
9, for example, shows that an M. marinum virulence gene identified by the method of the 
invention, and an M. tuberculosis homologue of it, appear to be polyketide synthase genes. 
As is well-known, many polyketides have therapeutic value (for human, veterinary, or 
aquaculture uses). For example, polyketides have been shown to function as antibiotics, 
chemotherapeutic agents or immunosuppressive agents, e.g., in transplant patients. The 
invention includes the generation and/or isolation (e.g., purification) of polyketide 
synthases encoded by virulence genes identified by the method of the invention, as well 
as polyketides produced by those synthases. The polyketides can be generated by 
recombinant means, isolated from non-recombinant bacteria, or produced synthetically. 
Methods for making, isolating and purifying polyketides are routine and well-known in 
the art. 

Recombinantly expressed polypeptides of the invention can also be used to confirm 
that a particular virulence gene is responsible, at least in part, for a pathogenic phenotype 
in an organism - that is, to confirm Koch's postulates. Example 8 shows how a 
recombinantly expressed M. marinum putative virulence gene can be used to complement 
a mutant bacterium which is defective in that gene, and to restore a virulent phenotype in 
fish infected by the complemented mutant. 

Virulence genes of the invention and peptides thereof can contain antigenic 
epitopes. The invention also encompasses antibodies, including polyclonal or monoclonal 
antibodies, or fragments of polyclonal or monoclonal antibodies, which are generated in 
response to such epitopes. Such antibodies can be used, e.g., in diagnostic assays to detect 
the presence of a mycobacterium, to identify virulent strains of bacteria, or in methods to 
treat disease conditions caused or exacerbated by a virulence protein (e.g., passive 
immunization), following routine, art-recognized procedures. 

The invention also encompasses an avirulent mycobacterium, preferably M. 
marinum and/or M. tuberculosis, which harbors one or more mutation(s) in one or more 
virulence gene(s) identified by the methods of the invention, or a pharmaceutical 
composition which comprises such a bacterium and a pharmaceutically acceptable carrier. 

In a preferred embodiment, the avirulent bacterium is introduced into a host (e.g., 
a fish, cow or human) in order to elicit an immune response. Because the bacterium is 
avirulent (e.g., attenuated), it is expected to be suitable for administration to a host in need 
of treatment, but it is also expected to be antigenic and to give rise to an immune response, 
preferably a protective immune response. For such a use, it is preferred that the mutation 
is substantially non-revertible, e.g., a deletion or frame-shift mutation. To ensure 
non-revertibility, it is preferable that a bacterium comprises at least two or three such 
mutations, preferably in different genes. A small deletion mutant would be expected to 
provide antigenic epitopes in the portion of the protein which lies downstream of the 
deletion, even though the protein, itself, is not functional with respect to virulence. 

Another embodiment of the invention is a vaccine comprising a suitable avirulent 
mycobacterium of the invention and a pharmaceutically acceptable carrier. By vaccine is 
meant an agent used to stimulate the immune system of a living organism so that 
protection against future harm is provided. Immunization refers to the process of inducing 
an antibody and/or cellular immune response in which T-lymphocytes can either kill the 
pathogen and/or activate other cells (e.g., phagocytes) to do so in an organism, which is 
directed against a pathogen or antigen to which the organism has been previously exposed. 
The term "immune response," as used herein, encompasses, for example, mechanisms by 
which a multi-cellular organism produces antibodies against an antigenic material which 
invades the cells of the organism or the extra-cellular fluid of the organism. The antibody 
so produced may belong to any of the immunological classes, such as immunoglobulins 
A, D, E, G or M. Other types of responses, for example cellular and humoral immunity, are 
also included. Immune response to antigens is well studied and widely reported. A survey 
of immunology is given, e.g., in Roitt, I. (1994). Essential Immunology, Blackwell 
Scientific Publications, London. Methods in immunology are routine and conventional 
(see, e.g., Current Protocols in Immunology, edited by John E. Coligan et al., John 
Wiley & Sons, Inc.). 

Methods of formulating, testing, optimizing and administering vaccines of the 
invention are routine and conventional, and are described, e.g., in U.S. Pat. Nos. 
5,876,931, 5,700,683, and references cited therein, and in "New Generation Vaccines," 
edited by M.M. Levine et al., 2nd edition, Marcel Dekker, Inc., New York, NY, 1997. 
Active immunization of a patient (e.g., human, fish, cow, etc.) is preferred. In this 
approach, one or more mutant bacteria are prepared in an immunogenic formulation 
containing suitable adjuvants and carriers and administered to the patient in known ways. 
Suitable adjuvants include Freund's complete or incomplete adjuvant, muramyl dipeptide, 
the "Iscoms" of EP 109 942, EP 180 564 and EP 231 039, aluminum hydroxide, saponin, 
DEAE-dextran, neutral oils (such as miglyol), vegetable oils (such as arachis oil), 
liposomes, Pluronic polyols or the Ribi adjuvant system (see, for example, GB-A-2 189 
141). "Pluronic" is a Registered Trade Mark. The patient to be immunized is a patient 
requiring to be protected from the disease caused by, or exacerbated by, the virulent form 
of the bacterium. 

The aforementioned avirulent bacteria of the invention or a formulation thereof 
may be administered by any conventional method including oral and parenteral (e.g., 
subcutaneous or intramuscular) injection. The treatment may consist of a single dose or 
a plurality of doses over a period of time. While it is possible for an avirulent bacterium 
of the invention to be administered alone, it is preferable to present it as a pharmaceutical 
formulation, together with one or more acceptable carriers. The carrier(s) must be 
"acceptable" in the sense of being compatible with the avirulent microorganism of the 
invention and not deleterious to the recipients thereof. Typically, the carriers will be water 
or saline which will be sterile and pyrogen free. 

It will be appreciated that a vaccine of the invention, depending on its bacterial 
component, may be useful in the fields of human medicine, veterinary medicine, or 
aquaculture. A vaccine for fish against Mycobacterium marinum could be of particularly 
significant economic importance. Mycobacterium marinum causes tuberculosis in more 
than 150 species of both salt-water and fresh-water fish, among them salmonid trout 
(Salmo gairdneri, Salmo trutta, Oncorhynchus mykiss), striped bass, tilapia, etc. 
Aquaculture facilities infected with M. marinum suffer from a constant mortality rate over 
a long period of time accompanied by severe economic losses, which could be ameliorated 
with such a vaccine. A vaccine against M. tuberculosis could, of course, be a significant 
weapon in the battle against tuberculosis, which is wide-spread in human populations. 

Vaccines encompassed by the invention also include killed bacterial vaccines; 
subunit vaccines comprising a virulence protein(s) of the invention (e.g., a wild type or 
mutant protein(s), or a variant(s) thereof), or an antigenic fragment(s) thereof; bacteria 
which produce or are capable of producing such virulence proteins or fragments; and DNA 
vaccines comprising a nucleic acid which encodes such a virulence protein or fragment 
thereof. Methods of making and using such vaccines are routine and conventional in the 
art. For methods of making and using DNA vaccines, see, e.g., U.S. Pat. No. 5,589,466. 

An avirulent bacterium of the invention can also be used as a "carrier" for the 
expression of one or more cloned heterologous gene(s) or fragments thereof. For example, 
an avirulent M. marinum organism can be used to express a secreted or surface-expressed 
heterologous peptide or polypeptide in fish, and an avirulent M. tuberculosis organism can 
be so used in humans. The avirulent bacterium can be used to express, e.g., an allergen, 
or an antigenic epitope from another pathogen, for which the modified bacterium can act 
as a vaccine. In a preferred embodiment, the heterologous gene is inserted at or near the 
position at which the transposon was inserted in an avirulent mutant, or at or near the site 
of the more "well-defined" avirulent mutation. Methods to clone heterologous genes are 
routine, as are methods to express them in a host. Methods of making and using such 
carriers are disclosed, e.g., in U.S. Pat. Nos. 5,876,931 and 5,424,065. 

The invention also encompasses a method for identifying an agent which reduces 
the ability of a microorganism to survive in a host, e.g., an anti-mycobacterial agent which 
inhibits expression of a virulence gene, or which attacks products produced directly or 
indirectly by a virulence gene. In a preferred embodiment, such an agent can be used to 
treat a disease caused by, or exacerbated by, a virulence gene of the invention. One such 
method, as disclosed, e.g., in U.S. Pat. No. 5,876,931, is to generate a bacterium which 
over-expresses the virulence gene, and then to identify an agent which reduces the viability 
or growth of a wild type cell but not the cell overexpressing the gene, in a host. Methods 
to generate the over-expressing strain, and to perform such screening procedures, are 
routine and are described, e.g., in U.S. Pat. No. 5,876,931. Other methods to screen for 
anti-mycobacterial drugs are routine and are described, e.g., in U.S. Pat. No. 5,700,683. 

The invention also relates to a method of screening vaccine candidates for human 
tuberculosis in the fish model. In one embodiment, based on the assumption that M. 
marinum bacteria may be suitable for human vaccines, goldfish can be inoculated with an 
M. marinum vaccine candidate of interest. The fish are then challenged with fully virulent 
M. marinum at a dose capable of establishing disease. A vaccine which, when inoculated 
into a fish, protects the fish from subsequent virulent challenge, as shown by the fish 
failing to develop disease symptoms, is a candidate for a human vaccine. In another embodiment, 
a putative virulence gene of M. tuberculosis is selected, and a mutation is made in the M. 
marinum homologue of that gene. The mutant M. marinum is then tested as a vaccine 
candidate, using the goldfish model as above. 

Brief Description of the Figures 

Fig. 1 shows the median survival time (MST) of fish inoculated with M. marinum. 
The median survival time of fish (days) inoculated with M. marinum at doses indicated per 
fish is compared to a phosphate buffered saline (PBS) control. *Survival to endpoint of 
experiment, 56 days. 

Fig. 2 shows a comparison of the growth of M. marinum in liver, spleen and 
kidney. The inoculum is 10^7 CFU/fish. Results are given as geometric means ± standard 
error for eight fish per time point. 

Fig. 3 shows a comparison of mean cumulative granuloma scores (MCGS) over 
time of fish infected with 10^7 CFU of M. marinum organisms. The results are given as a 
vertical box plot, with horizontal lines marking the median 10th, 25th, 50th, 75th and 95th 
percentile points of GSs for eight animals at each time point. The mean of each group is 
represented by a thick line. At 2 weeks, the median 50th percentile and mean values are 
the same. 

Fig. 4 shows a survival curve of goldfish inoculated with 10^8 CFU of M. marinum 
1218R (wild type) or 1218S (mutant). 

Fig. 5 shows the modification of pYUB285 with transposon tags. Bg is BglII; 
Bam is BamHI; H is HindIII; IR are inverted repeats which mark the boundaries of the 
transposon; ORFR and ORFA are transposon genes; aph is the gene for kanamycin 
resistance; oriE is the E. coli ori; and ΔoriM is the disabled mycobacterial ori. 

Fig. 6 shows the construction of an M. marinum signature-tagged mutant library. 

Fig. 7 shows a schematic diagram of an M. marinum mutant library screen in the 
goldfish model. 

Fig. 8 shows a survival curve of M. marinum mutant 41.2. 

Fig. 9 shows a survival curve of M. marinum mutant 80.1. 

Fig. 10 shows a survival curve of M. marinum mutant 86.1. 

Figs. 11A and B illustrate ligation-mediated PCR. 

Fig. 12 shows Competitive Indices of M. marinum mutants 32.2, 60.2, 62.2, 67.1, 
80.1, 86.1, 42.2, 80.8 and 68.6. 

Fig. 13 shows a survival curve of M. marinum mutant 67.1. 

Fig. 14 shows a survival curve of M. marinum mutant 39.2. 

Fig. 15 shows a survival curve of M. marinum mutant 42.2. 

Examples 

Example 1. Properties of the M. marinum/goldfish model 
A. Median Survival Time and LD50. 

To determine the median survival time of goldfish after inoculation with M. 
marinum strain ATCC 927, groups of 20 to 32 fish were inoculated intraperitoneally with 
10^9, 10^8 or 10^7 colony forming units (CFU). The median survival time of goldfish 
inoculated with M. marinum was dose dependent, with survival time decreasing with 
increasing doses of bacteria. The median survival time of fish was 4, 10, and >56 days 
(the endpoint of the experiment) with inocula of 10^9, 10^8, or 10^7 M. marinum organisms, 
respectively. All fish inoculated with 10^7 CFU or less survived to the end point of the 
experiment (56 days). The control fish group, inoculated with PBS in 5 separate 
experiments, had a total of two premature deaths, one at 8 and one at 19 days 
post-inoculation, from a total of 55 fish. The remainder of the control fish survived to 56 days, 
the endpoint of the experiment (See Figure 1). The LD50 at 1 week postinfection with M. 
marinum was 4.5 x 10^8 (calculated by the method of Reed & Muench, 1938. Am. J. Hyg. 
27, 493-497). 
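
For readers unfamiliar with the Reed & Muench calculation, the following is a minimal sketch of the method: deaths are accumulated from the lowest dose upward, survivors from the highest dose downward, and the 50% endpoint is interpolated on a log10 dose scale between the two doses that bracket 50% cumulative mortality. The function name and the counts in the example are hypothetical and are not the data of this Example.

```python
import math


def reed_muench_ld50(doses, dead, alive):
    """Estimate the LD50 by the Reed & Muench cumulative method.

    doses, dead and alive are parallel lists: the dose given to each group
    and the number of animals that died or survived in that group.
    """
    order = sorted(range(len(doses)), key=lambda k: doses[k])
    doses = [doses[k] for k in order]
    dead = [dead[k] for k in order]
    alive = [alive[k] for k in order]

    # Cumulative deaths run from low to high dose; survivors from high to low.
    cum_dead = [sum(dead[:k + 1]) for k in range(len(doses))]
    cum_alive = [sum(alive[k:]) for k in range(len(doses))]
    pct = [100.0 * d / (d + a) for d, a in zip(cum_dead, cum_alive)]

    for lo in range(len(doses) - 1):
        hi = lo + 1
        if pct[lo] < 50.0 <= pct[hi]:
            # Proportionate distance of the 50% point between the two doses.
            pd = (50.0 - pct[lo]) / (pct[hi] - pct[lo])
            step = math.log10(doses[hi]) - math.log10(doses[lo])
            return 10 ** (math.log10(doses[lo]) + pd * step)
    raise ValueError("no pair of doses brackets 50% mortality")


# Hypothetical counts, for illustration only (10 fish per dose group).
print(reed_muench_ld50([1e7, 1e8, 1e9], dead=[0, 2, 10], alive=[10, 8, 0]))
```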

B. Mycobacterial recovery from fish organs. 

To assess the ability of M. marinum to persist in goldfish tissue, the liver, spleen, 
and kidneys from each sacrificed fish were collected for bacteriological examination. M. 
marinum was recovered from all organs of fish in the 10^9 or 10^8 CFU inoculum groups. 
In fish inoculated with 10^7 CFU, M. marinum was recovered from 96% of the examined 
organs. 

The fate over an 8 week period of the M. marinum ATCC 927 strain in the livers, 
spleens, and kidneys of fish inoculated with 10^7 CFU was followed. (See Figure 2). There 
was a significant positive linear relationship between time postinoculation and colony 
recovery in the liver (P < 0.001); for the spleen and kidneys, the relationship was positive 
but did not reach statistical significance (P = 0.054 and P = 0.091, respectively). Between 
25 8 and 16 weeks postinoculation, Af. marinum persisted in the tissue with no significant 
change in the colony counts. In addition, in the 10 2 to 10 6 CFU inoculum groups, Af 
marinum was isolated from at least one organ from all infected fish. 

C. An acute and chronic form of mycobacterial infection.

The pathology of infected fish was dependent on the inoculum dose and the time
postinfection at which the animals were sacrificed. Fish infected with either 10^9 or
10^8 CFU of M. marinum organisms suffered from anorexia, sluggish movement, and loss
of equilibrium.

The histopathology of fish infected with 10^9 and 10^8 CFU was characterized by
severe peritonitis and necrosis as compared to control fish. The peritoneum was filled
with inflammatory cells consisting of lymphocytes, macrophages, and fibrous connective
cells, as well as with degenerating cells and bacteria. The mean cumulative granuloma score
(MCGS) for these 2 groups was similar (0.2 for the 10^9 CFU group and 0.9 for the 10^8
CFU group). In the 10^8 CFU inoculum group, granuloma formation was more likely to
be found in animals which survived more than 2 weeks postinoculation.

When examined at 2 weeks, 6 of 8 fish in the 10^7 CFU group had moderate to
severe peritonitis. Unlike the 10^8 and 10^9 CFU inoculum groups, which succumbed to
infection, the 10^7 CFU inoculum group survived the infection, and by 4 to 6 weeks
postinoculation, the acute peritoneal inflammation was replaced by a chronic inflammatory
state. Fish inoculated with 10^7 CFU demonstrated granuloma formation in all organs
evaluated (MCGS of 5.0), including the peritoneum and pancreas, liver (e.g., onion ring
granuloma composed of epithelioid macrophages surrounding a necrotic center), spleen,
trunk kidney, head kidney, heart and intestine. Pleomorphic granulomas (necrotizing,
non-necrotizing and caseous) were seen. The necrotizing granulomas were characterized by
a central area of necrosis surrounded by macrophages, epithelioid cells, and thin fibrous
connective tissue. Frequently, caseous necrosis was present in the central area of the
granuloma. Granulomas containing foamy macrophages were also seen. Occasionally,
Langhans and foreign body type giant cells were observed. In addition, acid fast bacilli
could be demonstrated with the modified Ziehl-Neelsen stain. Melanomacrophage centers
were seen in a few cases.

The chronic inflammatory response of fish towards M. marinum was time
dependent, as seen by the increase in mean cumulative granuloma scores (MCGSs) with
time, up to 8 weeks, in animals inoculated with 10^7 CFU (See Figure 3). From 8 to 16
weeks postinoculation, there was no significant change in MCGSs (5.0 and 5.7,
respectively).

D. Minimum infectious dose (MID). 

To estimate the lowest possible dose of M. marinum able to establish infection in
goldfish, groups of four fish were inoculated with M. marinum ATCC 927 at doses of 10^6,
10^5, 10^4, and 10^2 CFU. Granuloma formation was seen in 25% of the goldfish by 4 weeks
and in 88% by 8 weeks postinfection with a dose of 6.3 x 10^2 CFU or higher (Table 1).
The minimum number of organisms required to establish infection in goldfish appears to
be approximately 600 CFU.



Table 1. MID of M. marinum ATCC 927

                          No. positive(a)
Inoculum (CFU/fish)       4 Wk      8 Wk      MCGS

1.2 x 10^6                1/2       1/2       5.0
3.0 x 10^5                0/2       2/2       5.5
2.4 x 10^4                1/2       2/2       1.5
6.3 x 10^2                0/2       2/2       4.5

(a) Number of granuloma-positive animals per total number of animals
at 4 and 8 weeks postinoculation.
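
The percentages quoted for granuloma formation (25% by 4 weeks, 88% by 8 weeks) follow from pooling the counts in Table 1 across the four inoculum groups. A short sketch of that arithmetic is given here; the variable names are illustrative only, and the third dose label assumes the exponent 4, which is inferred from the list of doses in the text.

```python
# (inoculum label, positive at 4 wk, examined at 4 wk, positive at 8 wk, examined at 8 wk)
TABLE_1 = [
    ("1.2 x 10^6", 1, 2, 1, 2),
    ("3.0 x 10^5", 0, 2, 2, 2),
    ("2.4 x 10^4", 1, 2, 2, 2),  # exponent assumed from the doses listed in the text
    ("6.3 x 10^2", 0, 2, 2, 2),
]

def percent_positive(rows, pos_col, total_col):
    """Pool granuloma-positive animals over all inoculum groups."""
    positive = sum(row[pos_col] for row in rows)
    examined = sum(row[total_col] for row in rows)
    return 100.0 * positive / examined

pct_4wk = percent_positive(TABLE_1, 1, 2)  # 2 of 8 animals
pct_8wk = percent_positive(TABLE_1, 3, 4)  # 7 of 8 animals
print(f"4 weeks: {pct_4wk:.1f}%  8 weeks: {pct_8wk:.1f}%")
```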



E. Mycobacterial virulence assay.

The relative virulence of different strains of M. marinum, isolated from both human
and animal sources, was assessed. Three mycobacterial strains, M. marinum ATCC 927,
M and F-110, were inoculated into goldfish at 10^8 CFU. The median survival times of fish
inoculated with M. marinum M, ATCC 927, and F-110 were similar, ranging from 4 to 10 days.

Example 2 - Differentiation of an avirulent M. marinum mutant from the wild type in the
goldfish model

The goldfish model can differentiate between virulent and avirulent M. marinum
organisms. A comparison of such a pair of strains is shown in Figure 4. The M. marinum
strains designated 1218R (wild type, aka ATCC 927) and 1218S (avirulent mutant) were
inoculated into groups of 5 to 9 goldfish in two separate experiments at an inoculum dose
of 1.4 to 4 x 10^8 CFU. The median survival time of goldfish inoculated with M. marinum
1218R organisms was 3 days compared to 28 days (endpoint of experiment) with M.
marinum 1218S organisms (See Figure 4). The mutant 1218S also failed to persist in the
mouse macrophage model. This experiment shows that the fish mycobacteriosis model
can allow the identification of M. marinum virulence genes.

Example 3 - Signature-tagged mutagenesis, and the generation of a library 

A. Construction of a master bank of signature-tagged transposons 

As an initial step in creating a bank of signature-tagged transposons, plasmid
pAT30 is generated (see Figure 5). A unique restriction site (BglII) is introduced into the
mycobacterial transposon delivery vector pYUB285 between ORFA and aph. The vector
is a suicide vector in mycobacteria because of inactivation of the mycobacterial origin of
replication by an internal deletion. A kanamycin resistance gene (aph) inserted into
IS1096 allows for a library of insertions in the mycobacterial genome to be generated upon
electroporation of the plasmid followed by selection for kanamycin resistance.

To generate a collection of signature tagged transposons to be inserted into pAT30,
primers P5 (5'-CTAGGTACCTACAACCTC-3') (SEQ ID NO: 1) and P3
(5'-CATGGTACCCATTCTAAC-3') (SEQ ID NO: 2) and the template RT1 oligonucleotide
(5'-CTAGGTACCTACAACCTCAAGCTT-[NK]20-AAGCTTGGTTAGAATGGGTACCATG-3')
(SEQ ID NO: 3) are prepared by
conventional, routine methods, preferably using a commercially available oligonucleotide
synthesizer. The 5' ends of primers P5 and P3 have BamHI sites. The template RT1
oligonucleotide is similar to that designed by Hensel et al., with a variable central region
([NK]20) flanked by arms of invariant sequences. The invariant arms allow the sequence
tags to be amplified in a PCR with the use of primers P3 and P5. The variable region is
designed to ensure that the same sequence occurs only about once in 2 x 10^17 molecules.
PCR is performed, using standard, routine methods (see, e.g., Innis, M.A. et al, eds. PCR
Protocols: a guide to methods and applications, 1990, Academic Press, San Diego, CA)
to generate and amplify double stranded, 90 bp signature tags. The PCR amplified tags are
digested with BamHI, gel purified, and then ligated to the BglII digested,
dephosphorylated (calf intestinal phosphatase, New England BioLabs, Inc.) pAT30
plasmid. E. coli DH5α is transformed with this ligation mixture and plasmids from 800
individual clones are isolated, arrayed in 96 well microtiter plates, and transferred to nylon
membranes. These plasmids are analyzed for hybridization and tag amplification
efficiency. In this example, ninety-six plasmids that are hybridization and amplification
efficient are chosen for the master plasmid collection. The master plasmids are screened
for cross hybridization with other plasmids in the master plasmid collection and any
cross-hybridizing plasmids are eliminated until the collection has no cross hybridizing members.
Of course, a master plasmid collection of any size can be constructed by this method.
Methods for carrying out STM mutagenesis and isolating bacterial virulence mutants are
described, e.g., in Hensel et al (1995). Science 269, 400-403 and U.S. Pat. No. 5,876,931.
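
The tag design described above (a variable [NK]20 core between invariant arms that carry the primer binding sites) can be illustrated with a short sketch. It assumes the IUPAC meanings N = A, C, G or T and K = G or T; the function names are hypothetical, and the code only models sequence composition, not the PCR itself.

```python
import random

# Invariant arms and primers copied from SEQ ID NOs: 1-3 as given in the text.
FIVE_PRIME_ARM = "CTAGGTACCTACAACCTCAAGCTT"
THREE_PRIME_ARM = "AAGCTTGGTTAGAATGGGTACCATG"
P5 = "CTAGGTACCTACAACCTC"
P3 = "CATGGTACCCATTCTAAC"

def random_tag_template(rng, repeats=20):
    """Build one RT1-like template with a random [NK]20 variable region."""
    variable = "".join(rng.choice("ACGT") + rng.choice("GT") for _ in range(repeats))
    return FIVE_PRIME_ARM + variable + THREE_PRIME_ARM

def reverse_complement(seq):
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def is_amplifiable(template):
    """True if P5 matches the 5' end and P3 anneals to the 3' end of the template."""
    return template.startswith(P5) and template.endswith(reverse_complement(P3))

def variable_region(template):
    return template[len(FIVE_PRIME_ARM):len(template) - len(THREE_PRIME_ARM)]
```

Because the arms are identical in every template, a single primer pair amplifies any tag, while the 40-nucleotide variable region distinguishes one tag from another in hybridization.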

B. Optimization and initial characterization of M. marinum transposition 

Several protocols for the preparation of competent cells from M. marinum are
evaluated. The strains tested are ATCC 927 (fish isolate) and M. marinum strain M
(human isolate). Electrocompetent cells are prepared from M. marinum cells grown to
different growth phases at different temperatures in the presence of ethionamide or
cycloheximide. Mycobacterial cells are transformed by electroporation with the
replicative Escherichia coli-mycobacteria shuttle vector, pYUB18 (Jacobs, W.R. et al
(1991). Methods Enzymol. 204, 537-555), as well as the suicide vectors pYUB285
(McAdam R.A. et al (1995). Infect. Immun. 63, 1004-1012) and pUS252, carrying the
transposable elements IS1096 and IS6110, respectively (Dale, J.W. (1995). Eur. Respir.
J. 8, 633s-648s). Mutants of M. marinum are recovered on 7H10 agar plates supplemented
with kanamycin. Transformation and transposition efficiencies under different protocols
are compared, using routine, art-recognized procedures. See, e.g., McAdam et al (1995).
Infect. Immun. 63, 1004-1012 and Cirillo, J. D. et al (1991). J. Bacteriol. 173, 7772-7780.
Southern hybridization analysis is performed on mycobacterial mutants to confirm the
transposition events. These analyses show that: 1) competent cells prepared at room
temperature from late-exponential growth phase organisms yield a higher transposition
efficiency than cells prepared at 4°C or from early- or mid-exponential growth phase
organisms; 2) the highest efficiency for transposition is 10^2-10^3 cfu per ng of plasmid
DNA; and 3) the IS1096-derived transposon is best able to efficiently mutagenize M.
marinum.

To confirm that kanamycin-resistant M. marinum colonies are not spontaneous
mutants, colonies recovered after electroporation with the non-integrating, replicative
vector, pYUB18, are analyzed; the plasmid pYUB18 is successfully isolated from 6
separate transformants and is identified by restriction enzyme mapping. This indicates that
the transformants are not spontaneous mutants. In another experiment, 35 randomly
selected mutants recovered from electroporation of the suicide vector, pYUB285, are
examined by Southern analysis to determine whether transposition is random in the M.
marinum chromosome. All tested transposon mutants yield a single band, located in a
different position on the Southern blot, consistent with random integration of a single copy
of IS1096 into the M. marinum genome. Evaluation of 10 mutants obtained in a single
electroporation experiment shows that each mutant carries an insertion in a different part of the
M. marinum genome, indicating that the mutants from a given electroporation do not
represent siblings.

C. Generation of an M. marinum mutant library

An M. marinum mutant library is generated by electroporating individual members
of the 96 master plasmid collection into M. marinum bacteria (See Figure 6). M. marinum
electrocompetent cells are prepared from a 100 ml culture grown to late exponential phase
(O.D.600 = 1.6 to 1.8). Bacteria are washed three times at room temperature with 10%
glycerol and then suspended in 1 ml 10% glycerol and distributed to 0.2 cm gap
electroporation cuvettes (Bio-Rad Laboratories). Electroporation is performed at room
temperature using a Gene Pulser (Bio-Rad Laboratories) with parameters of 2.5 kV, 25 µF,
and 800 Ω. Electroporated cells are rescued by growth overnight in 7H9 broth with 10%
albumin-dextrose complex enrichment (ADC) (52) at 30°C and plated on 7H10 agar with
kanamycin (20 µg/ml) and incubated at 30°C. Mutants appear 1 to 2 weeks after plating.
Mutants from each electroporation are named for the master plasmid used for transposon
delivery (pAT30-1 plasmid yields mutants 1.1, 1.2, etc.). In this example, 960 mutants are
isolated, 10 mutants per master plasmid. Of course, more mutants can be isolated per
master plasmid, and the 96 (or additional) master plasmids can be used to generate
additional mutants.
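
The naming scheme above can be written out as a small sketch. The function name is hypothetical; the defaults (96 master plasmids, 10 mutants each) are the values given in this example.

```python
def mutant_names(n_master_plasmids=96, mutants_per_plasmid=10):
    """Name each mutant '<master plasmid number>.<isolate number>', e.g. 1.1, 1.2."""
    return [
        f"{plasmid}.{isolate}"
        for plasmid in range(1, n_master_plasmids + 1)
        for isolate in range(1, mutants_per_plasmid + 1)
    ]

library = mutant_names()
print(len(library), library[:3])
```

Because the first number identifies the master plasmid, it also identifies the signature tag carried by the mutant.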



Example 4 - Screening an M. marinum library for potential avirulent mutants, using the
goldfish model

A. Screening for mutants which show reduced viability in the goldfish host 

The M. marinum library obtained in Example 3 is screened for mutants which
exhibit a reduced ability to survive in the goldfish model. The library of M. marinum
transposon-tagged mutants is screened in pools; in this example, each pool has 48 mutants
(See Figure 7). Each of the mutants in a given pool is marked with a unique DNA tag (i.e.,
they are derived from 48 of the 96 master plasmids). To generate an input pool, mutants
that make up the pool are grown in individual wells of a 96-well microtiter plate
containing 7H9 broth with ADC and kanamycin (20 µg/ml) at 30°C until they reach
O.D.600 = 0.6-0.8. The mutants are then pooled and an aliquot is removed for amplification
using colony PCR (input pool probe). The remaining pooled bacterial cells are
centrifuged, resuspended in phosphate buffered saline (PBS) to an inoculum dose of about
2 x 10^7 cfu/ml, sonicated for 3 minutes, and injected into three fish. The fish are sacrificed
at 7 days postinoculation and spleen, liver and kidney are harvested. The mutants that
have reached and multiplied within these organs are recovered by plating homogenates of
the organs onto laboratory medium. The recovered mutants from a given organ are
combined and an aliquot is used for amplification using colony PCR (output pool probe).

The products of the input and output pool amplification are used in a second PCR
amplification using α-32P dCTP to generate two radiolabeled probes. The amplified probes
consist of a central variable region (the unique DNA tag) flanked by arms of invariable
sequences which permit amplification of any tag using a defined set of primers. The arms
are released by digestion with HindIII and the radiolabeled tags are used to probe replicate
membranes from the master plasmid collection. Because of the complex structure of the
mycobacterial cell wall and difficulties encountered in mycobacterial colony hybridization,
in this example the amplified tags are used as probes to a dot blot containing the master
plasmid collection. Hybridization to other forms of the master plasmid collection can, of
course, be used. Tags from mutants that hybridize to the probe from the input pool (Figure
7, membrane 1) but not to the probe from the output pool (Figure 7, membrane 2)
represent mutants which are unable to survive or compete in the fish model. Such mutants
are designated as potential virulence mutants.

The pools of mutants recovered from different organs are kept separate, in order
to characterize virulence mutants with regard to the organs examined. In some cases,
mutations necessary for survival at different points in the pathogenesis of this organism
can be identified, since the mechanisms necessary for survival in liver, spleen and kidney,
or in other organs, may differ. The pools of mutants recovered from different fish are also
kept separate. Mutants from two fish are used independently to produce an output pool
probe and are independently hybridized to replica membranes to confirm reproducible
identification of potential virulence mutants from a given experiment.
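
The selection logic of the screen (a tag that is present in the input pool but absent from the output pool, reproducibly in independent fish) can be sketched as a set comparison. This is an illustration only: the signal values, the numeric threshold and the function names are hypothetical, since hybridization is read from dot blots and no quantitative cutoff is stated here.

```python
def candidate_tags(input_signal, output_signal, threshold):
    """Tags detected by the input pool probe but not by the output pool probe.

    input_signal / output_signal map a tag name to a hybridization signal;
    the threshold separating 'detected' from 'not detected' is an assumption.
    """
    return {
        tag
        for tag, signal in input_signal.items()
        if signal >= threshold and output_signal.get(tag, 0.0) < threshold
    }

def reproducible_candidates(input_signal, outputs_by_fish, threshold):
    """Keep only tags that are lost from the output pool of every fish tested."""
    per_fish = [candidate_tags(input_signal, out, threshold) for out in outputs_by_fish]
    return set.intersection(*per_fish) if per_fish else set()
```

Tags that give no signal with the input probe are excluded from consideration, because their absence from the output pool would not be informative.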

B. Confirming that the mutants are avirulent by examining individual mutants in 
the goldfish model 

M. marinum transposon mutants that reproducibly hybridize to the input pool probe
but not to the output pool probe are examined individually in the goldfish model. An
inoculum dose of 10^8 bacteria in 0.5 ml per fish is used to inoculate 3 fish per mutant.
Control groups of fish are simultaneously inoculated with M. marinum ATCC 927 (wild type)
at the same dose as the mutants, or with PBS as a negative control. The median survival
time (MST) of goldfish inoculated with the wild type at this dose is 10 days. If the MST
for a given mutant is greater than that of the wild type, this indicates that the mutant may
have the transposon inserted into a virulence gene. When a mutant-inoculated fish
survives for 35 days, it is sacrificed and examined for histopathology, and portions of the
liver, spleen and kidney are homogenized and plated for colony counts. These mutants are
then inoculated into fish to determine the LD50. Three fish per mutant per dose are injected
with 10^8, 5 x 10^7, or 10^7 CFU bacteria. The LD50 for each mutant is evaluated at 1 week
postinoculation and calculated by the method of Reed and Muench (1938. Am. J. Hyg. 27,
493-497). The LD50 at 1 week for the wild type strain is 4.5 x 10^8 CFU bacteria per fish.
The LD50, Competitive Index, and/or pathology for each mutant is compared to that of the
wild type strain.

Competitive index: The competitive index may be used as a measure of the
attenuation of a mutant with respect to a wild type strain. Mutant and wild type strains are
mixed together in the inoculum. Animals are inoculated with the mixture and 2 weeks
post-inoculation the animals are sacrificed. The liver of the animal is removed,
homogenized, and the colony counts in the tissue are determined for both the mutant and
wild type strains. The two strains are distinguished because the mutant is kanamycin
resistant while the wild type is kanamycin sensitive. Mathematically, the competitive
index is defined as the output ratio of mutant to wild type bacteria, divided by the input
ratio of mutant to wild type bacteria. A mutant which has full virulence with respect to
the wild type should not be outcompeted by the wild type and the competitive index
should be 1.0.
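
The definition above translates directly into a short calculation. The sketch below is illustrative: the function names are hypothetical, and the helper that derives the wild type count assumes that total colonies are counted on non-selective plates and mutant colonies on kanamycin plates, which is one plausible way to apply the kanamycin marker and is not spelled out in the text.

```python
def wild_type_count(total_cfu, kanamycin_resistant_cfu):
    """Wild type CFU, assuming total counts come from non-selective plates
    and mutant counts from kanamycin plates (an assumption, see lead-in)."""
    return total_cfu - kanamycin_resistant_cfu

def competitive_index(input_mutant, input_wild_type, output_mutant, output_wild_type):
    """(output mutant / output wild type) divided by (input mutant / input wild type)."""
    output_ratio = output_mutant / output_wild_type
    input_ratio = input_mutant / input_wild_type
    return output_ratio / input_ratio
```

For example, a 1:1 inoculum that yields 10^3 mutant and 10^5 wild type CFU in the liver gives a competitive index of 0.01, whereas a fully virulent mutant recovered at the input ratio gives 1.0.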

Histopathology examinations: Portions of the liver, spleen and kidney along with
peritoneum, heart, pancreas, or other organs evident to one of skill in the art, are fixed in
10% neutral buffered formalin for routine embedding in paraffin. Five µm thin sections
of the paraffin fixed tissues are prepared with a rotary microtome (American Optical,
Buffalo, NY). After dewaxing, the sections are stained for acid fast bacilli with modified
basic fuchsin stain and counterstained with methylene blue or stained with hematoxylin
and eosin.

Colony counts in organ homogenates or the ability to induce granuloma formation:
These parameters can identify virulence defects which are more subtle than one which
causes the MST to change. Mutants identified in the screening protocol as failing to
survive in vivo, but which fail to cause a significant change from wild type in MST when
inoculated individually in fish, are further examined. For these experiments, an inoculum
dose of 10^7 CFU organisms is used, and animals are sacrificed at 4 and 8 weeks
postinoculation. The liver, spleen, kidney, and/or other organs which are evident to one
of skill in the art are harvested; one portion is homogenized for analysis of colony counts
and another portion is used for histopathology.

Example 5 - Sequencing and characterizing regions flanking the transposons in the
virulence mutants

Individual mutants confirmed in the goldfish model to be virulence mutants are
examined by sequencing the nucleic acid flanking the site of insertion of the transposon.
The sequence analysis can, of course, be performed before, simultaneously with, or after,
a virulence defect has been confirmed.

A. Direct sequencing of flanking regions 

In a most preferred embodiment, chromosomal DNA is isolated from each mutant
and cut with a restriction enzyme that cuts once within the transposon (in this example,
with BamHI). Linkers bearing a predefined PCR primer site, designed and generated
using routine, art-recognized methods, are ligated to the BamHI-cut ends; and PCR
fragments are amplified, using as primers a first outward primer sequence specific for a
portion of the transposon, and a second inward primer specific for the PCR primer site in
the appended linker, to generate an "amplified PCR fragment". In this example, a
transposon-specific primer sequence is chosen based on the sequence of the inserted
transposon, IS1096. By "specific for," as used herein, is meant that a primer (e.g., the first
outward primer) is sufficiently complementary to a target (e.g., the transposon) to bind to
it (hybridize; serve as a PCR primer) under selected high stringent conditions, but not to
bind to other, unintended, nucleic acids. Southern analysis, in which the membrane to
which the DNA has been transferred is probed with an α-32P labeled aph (kanamycin
resistance) gene, can be used to identify the size of the "amplified PCR fragment" from
each mutant. For example, mutants 41.2, 80.1 and 86.1 shown in Example 9 have unique
amplified PCR fragments, of 550, 200 and 600 bp, respectively. The amplified PCR
fragments are sequenced directly, using as primers one or both of the primers used to
generate them, or are cloned into a vector such as pGEM and sequenced using primers
corresponding to vector sequences. Methods for probing gels and sequencing DNA are
routine and conventional in the art.

In another embodiment, the chromosomal DNA is cut with an enzyme which does
not cut within the transposon. A variety of enzymes can be tested until one which
generates a DNA fragment of an appropriate size is identified. Here, KpnI is used. The
DNA is then ligated to create circular species and amplified by PCR using outward-facing
primers complementary to the two ends of the transposon. In this way, the sequences
which flank the insertion are amplified. These fragments are directly sequenced, using the
same primers used to amplify the sequence.

B. Cloning and then sequencing flanking regions

In another embodiment, the gene sequences interrupted by a transposon are cloned
first and then sequenced. Procedures for the analysis of DNA, including isolating DNA,
cloning it, manipulating it, and sequencing it, are routine and well-known in the art. In a
preferred embodiment, genomic DNA is extracted from each virulence mutant, and is
digested with one or more restriction enzymes (e.g., in this example, KpnI or BamHI) that
provide genomic fragments of an appropriate size for cloning. The digested DNA is
cloned into an appropriate plasmid, e.g., Bluescript II KS (Promega), or a low-copy
plasmid such as pACYC184, in E. coli DH5α, by using an appropriate positive selection
marker (e.g., kanamycin resistance). KpnI does not cut within the transposon, so digestion
with KpnI, followed by selection with kanamycin, results in cloning of the transposon
along with flanking DNA. BamHI cuts once within the transposon, so digestion with
BamHI, followed by selection with kanamycin, results in cloning of part of the transposon
along with flanking DNA on one side of the transposon. Once cloned, the gene sequence
interrupted (disrupted) by the transposon is determined by using outward primers based
on the sequence of the transposon insertion sequence, in this example, IS1096 (See, e.g.,
McAdam et al (1995). Infect. Immun. 63, 1004-1012).

C. Comparison of flanking sequences to known databases 

DNA sequences flanking each transposon (localized on one or on both sides of the
site of transposon insertion) are compared with the use of the BLAST programs provided
in the National Center for Biotechnology Information (NCBI) data base.

In order to identify M. tuberculosis homologues of M. marinum virulence genes,
the flanking sequences are also compared to the Mycobacterium database, using the
advanced BLAST search program, as above.
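
When a flanking DNA sequence is compared with protein databases, it is commonly translated in all six reading frames first, since the orientation and frame of the disrupted gene are not known in advance. A minimal, self-contained sketch of such a translation is given here; the function names are hypothetical, and the standard codon table is used as an assumption (unrecognized codons are reported as "X").

```python
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard codon table built from the conventional TCAG ordering.
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

def reverse_complement(seq):
    return seq.translate(str.maketrans("ACGTN", "TGCAN"))[::-1]

def translate(seq):
    """Translate complete codons only; '*' marks a stop codon."""
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3))

def six_frame_translations(dna):
    """Return translations keyed '+1'..'+3' (given strand) and '-1'..'-3' (reverse strand)."""
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames
```

Each of the six translated sequences can then be submitted as a separate protein query, or the nucleotide sequence can be given to a search program that performs the translation itself.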

A discussion of functional homologues and related virulence genes from M.
tuberculosis which have been identified for 3 M. marinum mutants is presented in
Example 9.

Example 6 - Isolating and characterizing wild type M. marinum genes which correspond
to the genes disrupted by transposons in avirulent M. marinum mutants

Probes based on flanking M. marinum DNA sequences, characterized, e.g., as in
Example 5, are generated and used to screen an M. marinum cosmid library (The
construction of such a cosmid library is described below). For example, part or all of the
"amplified PCR fragment" which is described in Example 5 is labeled and used as a
hybridization probe. Conditions for specifically hybridizing a probe to a target nucleic
acid (e.g., cosmid DNA) can be determined routinely by known methods in the art (see,
e.g., Nucleic Acid Hybridization, a Practical Approach, B.D. Hames and S.J. Higgins,
eds., IRL Press, Washington, 1985). It is preferred that hybridization probing is done
under selected high stringent conditions to ensure that the gene, and not a relative, is
obtained. Of course, conditions of any stringency can be employed. By "high stringent"
is meant that the gene hybridizes to the probe (e.g., when the gene is immobilized on a
filter) and the probe (which in this case is preferably about >200 nucleotides in length) is,
e.g., in solution, and the immobilized gene/hybridized probe is washed in 0.1 X SSC at
65°C for 10 minutes. SSC is 0.15M NaCl/0.015M Na citrate. In general, "high stringent
hybridization conditions" are used which allow hybridization only if there are about 10%
or fewer base pair mismatches. As used herein, "high stringent hybridization conditions"
means any conditions in which hybridization will occur when there is at least 95%,
preferably about 97 to 100%, nucleotide complementarity (identity) between the nucleic
acids. The corresponding cosmid is identified; and individual virulence genes are
subcloned from the cosmid clone, using routine, conventional procedures in the art. The
complete gene sequence is determined by routine, conventional methods.

Construction of an M. marinum cosmid library: An M. marinum genomic library
in an E. coli - Mycobacteria shuttle cosmid (pYUB18) is constructed, using, e.g., methods
disclosed in Jacobs, W.R. et al (1991). "Genetic Systems for Mycobacteria," in Methods
Enzymol. 204, 537-555. The pYUB18 vector has a unique BamHI site that can serve as
the site of insertion of partial Sau3A-digested chromosomal DNA. Following in vitro
packaging, the constructed libraries are transduced into cosmid in vivo packaging strains
to permit amplification and efficient repackaging of recombinant cosmids into
bacteriophage λ heads, thus allowing for storage of the libraries as phage lysates.

Example 7 - Isolating and characterizing M. tuberculosis genes which correspond to M.
marinum virulence genes

In order to identify an M. tuberculosis gene which corresponds to a particular M.
marinum gene, an "amplified PCR fragment" from the M. marinum gene, such as that
described in Example 5 or a fragment thereof, can be used to probe a cosmid library of M.
tuberculosis. Most preferably, a probe based on the corresponding M. tuberculosis
sequence, itself, is used. An M. tuberculosis cosmid library is constructed by routine
methods. Hybridization is performed as described, e.g., in Example 6. Positive cosmid
clones are identified and the hybridizing sequences subcloned and sequenced, using
routine, conventional, methods in the art.

Well-defined mutations can be introduced into a cloned M. tuberculosis gene,
using the methods described herein for generating site-specific mutations in M. marinum
genes. The mutations can then be introduced into the M. tuberculosis genome by
homologous recombination. In a most preferred embodiment (as disclosed, e.g., in
Balasubramanian, V. et al (1996). J. Bacteriol. 178, 273-279, and Reyrat, J. et al (1995).
PNAS 92, 8768-8772), the recombination is performed with long linear recombination
substrates containing the mutated gene (virulence gene::aph) on a DNA fragment (>40 kb).
This fragment is electroporated into the H37Rv strain of M. tuberculosis, selecting for
kanamycin resistance. Chromosomal DNA from the parent H37Rv strain and the
kanamycin-resistant transformants is digested with KpnI and probed with a KpnI
fragment containing the virulence gene::aph fragment. The strains containing the
disrupted allele show a signal from a fragment which is 1.3 kb greater (aph gene) than the
hybridizing fragment from the wild type gene clone (control). These mutant strains can
be tested, e.g., in the guinea pig infection model (See, e.g., Collins, D.M. et al (1995).
PNAS 92, 8036-8040).

Alternatively, allelic exchange can be performed using ts-sacB vectors (see, e.g.,
Pelicic et al. (1997). PNAS 94, 10955-10960). The virulence gene::aph construct is
inserted into pJM10, a ts-sacB E. coli - Mycobacteria vector containing the kanamycin
resistance gene for selection. The plasmid is introduced into the H37Rv strain of M.
tuberculosis by electroporation with selection initially at 32°C on 7H10-kanamycin.
Transformants are selected, grown in liquid culture, and then plated at 39°C on
7H10-kanamycin + 2% sucrose plates. Transformants obtained on the counterselective plates
represent allelic exchange mutants.



Example 8 - Complementation assays 

A candidate virulence gene is reintroduced into a transposon mutant on a low copy
number E. coli - mycobacteria shuttle vector (pYUB213Δkm) (Ramakrishnan, L. et al
(1997). J. Bacteriol. 179, 5862-5868) to determine whether the cloned gene complements
the virulence defect in the goldfish model. This plasmid is a derivative of pMV262
(Stover, C.K. et al (1991). Nature 351, 456-460) with a bleomycin resistance gene for
selection. Bacteria are recovered from those fish in which the virulence defect has been
complemented, and analyzed for bleomycin and kanamycin resistance to confirm that the
complementing plasmid is present.

Some cloned virulence gene candidates may fail to complement the virulence
defect in the fish model because of, e.g., instability of the cosmid clone, polar effects in
the original mutation, requirement for a cluster of genes surrounding the interrupted gene,
or toxic effects associated with overexpression of genes from multicopy plasmids. In
order to overcome these problems, several alternative approaches can be used.

One approach is to utilize an integrating E. coli - mycobacterial shuttle vector,
pMV361 (Stover, C.K. et al (1991). Nature 351, 456-460). The vector integrates in a
site-specific manner into the chromosomal attB site. This site is in a well-conserved part of
the mycobacterial genome and has been identified in BCG, M. smegmatis, M. bovis, M.



WO 01/19993 



PCT/US00/25512 




chelonei, M. leprae, M. phlei, and M. tuberculosis. Prior to the use of this vector in M.
marinum, the presence of the attB site in M. marinum is confirmed by Southern blot
analysis of M. marinum chromosomal DNA digested with BamHI using a radiolabeled
1.7-kb SalI attB fragment from M. smegmatis. In order to use this vector in mutants which
contain the kanamycin resistance gene, the vector is modified to delete the kanamycin
gene and to insert the bleomycin gene as was done, e.g., with the construction of
pYUB213Δkm (Ramakrishnan, L. et al (1997). J. Bacteriol. 179, 5862-5868). Using an
integrating vector eliminates the possible instability seen with extrachromosomal plasmid
maintenance in vivo (the integrated vector is stably maintained even without antibiotic
selection), and the toxic effects associated with multicopy plasmids are reduced or
eliminated since integration results in a single copy of the gene in the chromosome. To
address the possibility that the original transposon insertion phenotype was due to a polar effect
on a downstream gene or that a cluster of genes is required for complementation, larger
fragments of the original cosmid clone can be inserted into the integrating plasmid.

Another approach is to construct by allelic exchange specific chromosomal
mutations in the identified virulence genes. Methods for using long linear recombination
substrates for allelic exchange are provided, e.g., in Balasubramanian, V. et al (1996). J.
Bacteriol. 178, 273-279. Other methods for homologous recombination are found, e.g.,
in Aldovini, A.R. et al (1993). J. Bacteriol. 175, 7282-7289; Norman, E. et al (1995). Mol.
Microbiol. 16, 755-760; Baulard, A. et al (1996). J. Bacteriol. 178, 3091-3098; Marklund,
B.I. et al (1995). J. Bacteriol. 177, 6100-6105; and Ramakrishnan, L. et al (1997). J.
Bacteriol. 179, 5862-5868. These specific mutations allow the creation of non-polar
mutations in the virulence genes.



Example 9 - Identification and characterization of thirteen M. tuberculosis virulence genes

DNA regions flanking transposon insertion points for 13 mutants were amplified
by inverse PCR and sequenced. Predicted amino acid sequences from all six reading
frames of the DNA sequences obtained were subjected to similarity search of the nr
database, using the NCBI BLAST program. The nr database includes, e.g., all non-redundant
GenBank CDS translations, PDB, SwissProt, PIR and PRF sequences. An
advanced BLAST search determined whether a homologous protein sequence was present
in the Mycobacterium tuberculosis genome. The translated flanking sequences of mutants
41.2, 80.1, 86.1, 62.2, 67.1, 80.8, 39.2, 114.7, 32.2, 42.2, 60.2, 68.6 and 95.3 exhibited
sequence identities with functionally homologous proteins from M. tuberculosis of 93%,
42%, 37-51%, 77%, 38%, 78%, 43%, 82%, 64%, 62%, 58-77%, 38%, and 36-47%,
respectively.
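
The pairing of mutants and percent identities in the preceding sentence is purely positional, which is easy to misread. The following minimal Python sketch makes the pairing explicit. The variable names are illustrative only and are not part of the original disclosure; the values are copied from the sentence above.

```python
# Mutants and percent identities, in the order listed in the text above.
mutants = ["41.2", "80.1", "86.1", "62.2", "67.1", "80.8", "39.2",
           "114.7", "32.2", "42.2", "60.2", "68.6", "95.3"]
identities = ["93%", "42%", "37-51%", "77%", "38%", "78%", "43%",
              "82%", "64%", "62%", "58-77%", "38%", "36-47%"]

# Both lists must have one entry per mutant (13 in total).
assert len(mutants) == len(identities) == 13

# Map each mutant to the identity reported for its M. tuberculosis homologue(s).
identity_by_mutant = dict(zip(mutants, identities))

for mutant, identity in identity_by_mutant.items():
    print(f"mutant {mutant}: {identity} identity")
```

Ranges such as 37-51% correspond to mutants for which more than one M. tuberculosis homologue is reported.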

Gene 41.2 

The sequence of the flanking region of M. marinum mutant 41.2 is as follows:

5'-CGGGCCGATCTATGACGAGNACGACGGGACAGATGGGTCCCCGGATGGTCTA
CACCGAGACCAAACTGAACTCGTCGTTCTCCTTCGGCGGGCCCAAGTGTCT
GGTGAAGGTGATCCAAAAACTGTCCGGGTTGAGCATCAACCGGTTCATCGC
CATCGACTTCGTCGG-3' (SEQ ID NO: 4)

This can be translated in the third reading frame to the following protein sequence:

1 GRSMTXTTGQ MGPRMVYTET KLNSSFSFGG PKCLVKVIQK LSGLSINRFI
51 AIDFV (SEQ ID NO: 5)
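
The reading-frame translation stated above can be checked with a short script. The following is a minimal sketch, assuming the standard genetic code and assuming that any codon containing an ambiguous base (N) is reported as X; the function name `translate` and the constant names are hypothetical and are not taken from the original disclosure.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str, frame: int = 1) -> str:
    """Translate the forward strand in reading frame 1, 2 or 3.

    Codons that contain a non-ACGT character (e.g. N) are reported as 'X'
    (an assumption of this sketch); '*' marks a stop codon.
    """
    if frame not in (1, 2, 3):
        raise ValueError("frame must be 1, 2 or 3")
    seq = "".join(dna.upper().split())  # drop whitespace and line breaks
    start = frame - 1
    codons = (seq[i:i + 3] for i in range(start, len(seq) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


# Flanking sequence of mutant 41.2 (SEQ ID NO: 4), as given in the text.
SEQ_ID_NO_4 = """
CGGGCCGATCTATGACGAGNACGACGGGACAGATGGGTCCCCGGATGGTCTA
CACCGAGACCAAACTGAACTCGTCGTTCTCCTTCGGCGGGCCCAAGTGTCT
GGTGAAGGTGATCCAAAAACTGTCCGGGTTGAGCATCAACCGGTTCATCGC
CATCGACTTCGTCGG
"""

print(translate(SEQ_ID_NO_4, frame=3))
```

Run on SEQ ID NO: 4 in the third frame, this sketch reproduces the 55-residue sequence given as SEQ ID NO: 5, including the X at the codon that contains the ambiguous base.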

The mutant (41.2), when tested individually in the goldfish model, exhibits 
attenuated virulence as compared to the wild type organism (See Figure 8). 

The gene interrupted in the attenuated mutant has been characterized by sequence
analysis. Using the mycobacterium database, a functional homologue of this gene has
been identified in M. tuberculosis (emb|CAA17628| (AL022004); (Rv0822c)). Using the
general genomic database, the gene has been shown to be most closely related to gene
emb|CAA20411| (AL031317), a transcriptional regulator of Streptomyces coelicolor
which belongs to the AraC family of transcriptional regulators. This suggests that the gene
identified as interrupted in mutant 41.2 is a putative transcriptional regulator belonging to
the AraC family.

The proteins belonging to this family have at least three main regulatory functions
in common: carbon metabolism, stress response, and pathogenesis (see, e.g., Gallegos,
M-T et al (1997). Microbiology and Molecular Biology Reviews 61, 393-410). Certain
of these regulatory proteins are involved in the production of virulence factors in infections
of plants or mammals. These regulatory factors have been found in microbes that colonize
the gastrointestinal, respiratory, or genitourinary tracts. These proteins are involved
in stimulation of the synthesis of proteins that play a role in adhesion to epithelial tissues,
components of the cell capsule, and invasins. Some members of the family control the
production of other virulence factors. Some regulators are involved in the response to
stressors, including oxidative stress and transition from exponential growth to the
stationary phase. Without wishing to be bound by any mechanism, these observations
suggest that the role of this gene in M. tuberculosis pathogenesis may be in invasion of the
macrophage, survival in the macrophage (oxidative stress) or in transition to the latent
state of tuberculosis (transition from exponential to stationary phase).

Gene 80.1

The sequence of the flanking region of M. marinum mutant 80.1 is as follows:

5'-ACCTCCTGAATGTGTGACATGGCCCTAGAACCCTGCNTTAGACTATTTACATA
CATGGCTTCACCCGGCCGCCTGTGCCACTCATAAGACTACTGGAATGGACC
AACAATCGCACAGTCATCTGAAGCAGGAGTCTGTTAATCACAGGCCCTGAA
GGAACAGTGACTGTGCAGAGAAAGACGGCAATGCATCCTGTTAACTAAGT
GGCTGGAGGAGTGCCAGGTCATTCCAAAGAACATCCCTGAAATCTGGAGG
AGAAGGTATAGTGAGCACCCCAAAATTTCAACTGGAGACATCANACCAGA
GTCTCTACTGAGCTGCCAAGCTTGCGGCCGCACTCGAGTAACTAGTTAACC
CCTTGGGGCCTCTAAACGGGTCTTGA-3' (SEQ ID NO: 6)

This can be translated in the second reading frame to the following protein sequence:

1 PPECVTWP*N PALDYLHTWL HPAACATHKT TGMDQQSHSH LKQESVNHRP
51 *RNSDCAEKD GNASC*LSGW RSARSFQRTS LKSGGEGIVS TPKFQLETSX
101 QSLY*AAKLA AALE*LVNPL GPLNGS* (SEQ ID NO: 7)

The mutant (80.1), when tested individually in the goldfish model, exhibits 
attenuated virulence as compared to the wild type organism (See Figures 9 and 12). 

The gene interrupted in the attenuated mutant has been characterized by sequence
analysis, as described above for mutant 41.2. Functional homologues of this gene have
been identified in M. leprae (sp|P54580|YV23_MYCLE; B2168_C2_209) and M.
tuberculosis (sp|O11162|YV23_MYCTU; CY20G9.23). Based on the sequence analysis,
the gene identified as interrupted in mutant 80.1 is a hypothetical integral membrane
protein, most closely related to a glutamate receptor channel, dbj|BAA02254.1| (D12822),
from Mus musculus.

Gene 86.1 

The sequence of the flanking region of M. marinum mutant 86.1 is as follows: 
5'-TCATCGCTAACCGGTTGAGCTACCGCCCGCACAGCGTGCCCATCATCTC
CAACCTGACCGGCTCACTTGCCACAGTCGAGCAACTCACATCGCCCCGCTA
TTGGGCACAGCATGTACGGGAGCCAGTGCGGTTTCATGACGGCGTTACCGG
CTTGTTGGCAGGCGGAGAACA-3' (SEQ ID NO: 8)

This can be translated in the third reading frame to the following protein sequence:

1 IANRLSYRPHSVPIISNLTGSLATVEQLTSPR
YWAQHVREPVRFHDGVTGLLAGGE (SEQ ID NO: 9)

The mutant (86.1), when tested individually in the goldfish model, exhibits
attenuation in virulence as compared to the wild type organism (See Figures 10 and 12).

The gene interrupted in the attenuated mutant has been characterized by sequence
analysis, as described above for mutant 41.2. A family of functional homologues of this
gene has been identified in M. tuberculosis (emb|CAB06094|Z83857|ppsE;
emb|CAB06605|Z84725|pks6; emb|CAB09100|Z95617|pks9;
emb|CAB09098|Z95617|pks8; emb|CAB06103|Z83858|pks1; pir|S73075|pks002c
protein). Based on the sequence analysis, the gene identified as interrupted in mutant
86.1 is a polyketide synthase gene, most closely related to polyketide synthase genes
AF263912 (Streptomyces noursei) and AF015823 (Streptomyces venezuelae).

Polyketides are lipid-like molecules that have potent biological activities.
Examples of polyketides include antibiotics (erythromycin), immunosuppressants
(rapamycin, FK506), antifungal agents (amphotericin B), antihelminthic agents
(avermectin), and cytostatins (bafilomycin). A polyketide toxin has been recently
described in Mycobacterium ulcerans (George, K.M. et al (1999). Science 283, 854-856)
but no homologue was identified by sequence analysis in M. tuberculosis. Although it was
recognized during analysis of the M. tuberculosis genome project that the genome contains
a large number of polyketide synthesis genes, no polyketides from M. tuberculosis have
been identified. The finding that a mutation in this gene attenuates the virulence of the M.
marinum strain suggests that, although a polyketide toxin has not been
identified, a product of this synthesis pathway is responsible for virulence. Without
wishing to be bound to any mechanism, these observations suggest that a product of the
polyketide synthesis pathway may be responsible for the tissue destruction and
immunological modulation characteristic of diseases such as leprosy and tuberculosis.

Gene 62.2

The sequence of the flanking region of M. marinum mutant 62.2 is as follows:

GATCCGGTGCCGCCTTGACCGGCCGCGCCACCAGTACCGCCGACGCCGCCCTG
GCCGCCGGCTTGTGCGGCTTGCGATGGGTCGGTGCTGTCGGTGCCGGTGCC
TCCGGTGCCGCCTTGGCCTCCGGTTCCGCCGGTGCCGCCCTGGCCGCCGGC
GCCTTGGATGCCGCCGGTGCCGGTTCCGGCTGCACCGCCCGTTCCGCCGGT
TCCGCCTGCGCCGCCGGTGCCT (SEQ ID NO: 10)

This can be translated in the -2 reading frame to the following protein sequence:

227 ggcaccggcggcgcaggcggaaccggcggaacgggcggtgcagcc
    GTGGAGGTGGTGGAA
182 ggaaccggcaccggcggcatccaaggcgccggcggccagggcggc
    GTGTGGIQGAGGQGG
137 accggcggaaccggaggccaaggcggcaccggaggcaccggcacc
    TGGTGGQGGTGGTGT
92 gacagcaccgacccatcgcaagccgcacaagccggcggccagggc
    DSTDPSQAAQAGGQG
47 ggcgtcggcggtactggtggcgcggccggtcaaggcggcaccgga
    GVGGTGGAAGQGGTG (SEQ ID NO: 12)
2 tc 1 (SEQ ID NO: 11)

The mutant (62.2), when tested individually in the goldfish model, exhibits
attenuated virulence (reduced Competitive Index) as compared to the wild type organism
(See Figure 12).

The gene interrupted in the attenuated mutant has been characterized by sequence
analysis, as described above for mutant 41.2. Using either the mycobacterium or the
general genomic database, a functional homologue of this gene has been identified in M.
tuberculosis (emb|CAA17748.1| (AL022022); (Rv3511)).

This is a hypothetical glycine-rich protein (Rv3511) belonging to a large M.
tuberculosis PE-PGRS protein family, which comprises roughly 5% of the coding DNA
of M. tuberculosis. The genes of this family are scattered throughout the genome of M.
tuberculosis and other closely related mycobacteria. This family is characterized by a
relatively conserved amino acid NH2-terminus. The function of these proteins is unknown,
but some hypotheses are that they represent a source of antigenic diversity or that their
glycine repeats inhibit host major histocompatibility complex class I processing, akin to
the glycine repeats of the Epstein-Barr virus EBNA-1 protein. The finding that
a mutation in this gene attenuates the virulence of the M. marinum strain suggests that the
protein product of this gene is responsible for the immunological modulation characteristic
of diseases such as leprosy and tuberculosis.



Gene 67.1 

The sequence of the flanking region of M. marinum mutant 67.1 is as follows:

GGTCGAAGACTATCGGTATGCTCCATAGCGTTCCGTCGGGAAGCTGCATGT
TGTCAAGGGTTTCGTCGACCTCTCGGCGACCCATGAATCCCGATAGTGGCG
TGAAGAAACCGTACGAGATGCTGATCACCTCGTGGGCGGTCGCCTTCGATA
TCGGGATGCGCACCAATCCCTCAATCCGGCCGGCCACGTTTTCCCTTTCCAC
CCTGTCGACGAGTGGGTGTCCGTTATGGCCTAAATAATCCATCTTGCTGCCT
CTTTCTGAAATCGAATTTATTACTATCG (SEQ ID NO: 13)

This can be translated in the six reading frames to the following protein sequences:

DNA: GGTCGAAGACTATCGGTATGCTCCATAGCGTTCCGTCGGGAAGCTGCATGT
+3: SKTIGMLHSVPSGSCML
+2: VEDYRYAP*RSVGKLHV
+1: GRRLSVCSIAFRREAAC

DNA: TGTCAAGGGTTTCGTCGACCTCTCGGCGACCCATGAATCCCGATAGTGGCG
+3: SRVSSTSRRPMNPDSGV
+2: VKGFVDLSATHESR*WR
+1: CQGFRRPLGDP*IPIVA

DNA: TGAAGAAACCGTACGAGATGCTGATCACCTCGTGGGCGGTCGCCTTCGATA
+3: KKPYEMLITSWAVAFDI
+2: EETVRDADHLVGGRLRY
+1: *RNRTRC*SPRGRSPSI

DNA: TCGGGATGCGCACCAATCCCTCAATCCGGCCGGCCACGTTTTCCCTTTCCA
+3: GMRTNPSIRPATFSLST
+2: RDAHQSLNPAGHVFPFH
+1: SGCAPIPQSGRPRFPFP

DNA: CCCTGTCGACGAGTGGGTGTCCGTTATGGCCTAAATAATCCATCTTGCTGC
+3: LSTSGCPLWPK*SILLP
+2: PVDEWVSVMA*IIHLAA
+1: PCRRVGVRYGLNNPSCC

DNA: CTCTTTCTGAAATCGAATTTATTACTATCG (SEQ ID NO: 13)
+3: LSEIEFITI (SEQ ID NO: 14)
+2: SF*NRIYYY (SEQ ID NO: 15)
+1: LFLKSNLLLS (SEQ ID NO: 16)

DNA: CGATAGTAATAAATTCGATTTCAGAAAGAGGCAGCAAGATGGATTATTTAG
-1: R***IRFQKEAARWII*
-2: DSNKFDFRKRQQDGLFR
-3: IVINSISERGSKMDYLG

DNA: GCCATAACGGACACCCACTCGTCGACAGGGTGGAAAGGGAAAACGTGGCCG
-1: AITDTHSSTGWKGKTWP
-2: P*RTPTRRQGGKGKRGR
-3: HNGHPLVDRVERENVAG

DNA: GCCGGATTGAGGGATTGGTGCGCATCCCGATATCGAAGGCGACCGCCCACG
-1: AGLRDWCASRYRRRPPT
-2: PD*GIGAHPDIEGDRPR
-3: RIEGLVRIPISKATAHE

DNA: AGGTGATCAGCATCTCGTACGGTTTCTTCACGCCACTATCGGGATTCATGG
-1: R*SASRTVSSRHYRDSW
-2: GDQHLVRFLHATIGIHG
-3: VISISYGFFTPLSGFMG

DNA: GTCGCCGAGAGGTCGACGAAACCCTTGACAACATGCAGCTTCCCGACGGAA
-1: VAERSTKPLTTCSFPTE
-2: SPRGRRNP*QHAASRRN
-3: RREVDETLDNMQLPDGT

DNA: CGCTATGGAGCATACCGATAGTCTTCGACC (SEQ ID NO: 17)
-1: RYGAYR*SST (SEQ ID NO: 18)
-2: AMEHTDSLR (SEQ ID NO: 19)
-3: LWSIPIVFD (SEQ ID NO: 20)

The mutant (67.1), when tested individually in the goldfish model, exhibits
attenuated virulence as compared to the wild type organism (See Figures 12 and 13).

The gene interrupted in the attenuated mutant has been characterized by sequence
analysis, as described above for mutant 41.2. Using the mycobacterium database, a
functional homologue of this gene has been identified in M. tuberculosis
(emb|CAB08565.1| (Z95324) purA). This homologue, in the +2 frame, with an identity of
38% (similarity of 57%), is an adenylosuccinate synthetase (M. tuberculosis homologue
008381). This protein product plays an important role in the de novo pathway of purine
nucleotide biosynthesis. Thus in the host animal, particularly in the macrophage where
nutrients may be limiting, the product of this gene may be required for survival of
Mycobacterium marinum and M. tuberculosis.

Based on the sequence analysis against the entire genomic database, the gene identified
as interrupted in mutant 67.1 is a sulfate adenylyltransferase with homology to diverse
organisms including Pyrococcus abyssi, Synechocystis sp., and Bacillus subtilis. The
homology is in the -3 reading frame of the translated gene product and shows 27-40%
identity (51-62% similarity). The homology noted to the sulfate adenylyltransferase enzymes
suggests that mutant 67.1 is attenuated in its ability to respond to sulfate starvation, as this
enzyme is required for growth in defined synthetic medium with sulfate as a sulfur source.
This suggests that in the animal host a sulfur source is limiting and thus interruption of this
gene attenuates growth of the organism in the animal host. Thus interruption of this gene
in a live attenuated Mycobacterium vaccine strain would be beneficial, as it will limit the
ability of the vaccine strain to grow in the animal host.

Gene 80.8 

The sequence of the flanking region of M. marinum mutant 80.8 is as follows:

CCAATTAGCTGATTATTCCTCGGGCGTGCTCAACGCCAAGGACTACATATC
AGGTTACTTCCACTAAAATTCGCGGGCCCCGATCGGCGACATTACTCGACG
GTTTTCGGGGGAATCTCAGCGGTGATGGCATTCTTGAGGGCGACGTAGCGT
TTGGCGTCGGGATC (SEQ ID NO: 21)

This can be translated in the -1 reading frame to the following protein sequence:

DPDAKRYVALKNAITAEIPPKTVE*CRRSGPANFSGSNLICSPWR*ARPR
NNQLI (SEQ ID NO: 22)

The mutant (80.8), when tested individually in the goldfish model, exhibits
attenuation in virulence (reduced Competitive Index) as compared to the wild type
organism (See Figure 12).
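
Negative reading frames, such as the -1 frame quoted above for mutant 80.8, are read from the reverse complement of the listed strand. The following minimal sketch illustrates this, assuming the standard genetic code and assuming that frame -1 starts at the first base of the reverse complement; the function names are hypothetical and not part of the original disclosure.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA string (whitespace ignored)."""
    seq = "".join(dna.upper().split())
    return seq.translate(COMPLEMENT)[::-1]


def translate_frame(dna: str, frame: int) -> str:
    """Translate in frame +1..+3 (listed strand) or -1..-3 (reverse complement).

    Codons containing a non-ACGT character are reported as 'X'
    (an assumption of this sketch); '*' marks a stop codon.
    """
    if frame not in (1, 2, 3, -1, -2, -3):
        raise ValueError("frame must be one of +1, +2, +3, -1, -2, -3")
    seq = "".join(dna.upper().split())
    if frame < 0:
        seq = reverse_complement(seq)
    start = abs(frame) - 1
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X")
                   for i in range(start, len(seq) - 2, 3))


# Flanking sequence of mutant 80.8 (SEQ ID NO: 21), as given in the text.
SEQ_ID_NO_21 = """
CCAATTAGCTGATTATTCCTCGGGCGTGCTCAACGCCAAGGACTACATATC
AGGTTACTTCCACTAAAATTCGCGGGCCCCGATCGGCGACATTACTCGACG
GTTTTCGGGGGAATCTCAGCGGTGATGGCATTCTTGAGGGCGACGTAGCGT
TTGGCGTCGGGATC
"""

print(translate_frame(SEQ_ID_NO_21, -1))
```

Under these assumptions the script reproduces the 55-residue sequence listed as SEQ ID NO: 22, including its two internal stop codons.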

The gene interrupted in the attenuated mutant has been characterized by sequence
analysis, as described above for mutant 41.2. Using either the mycobacterium or the
general genomic database, a functional homologue of this gene has been identified in M.
tuberculosis (emb|CAB02482.1|Z80343| lipE). This is a probable carboxylic-ester
hydrolase (M. tuberculosis homologue Rv3775), also referred to as an esterase or lipE. The
homology is in the -1 reading frame with 83% similarity, 78% identity. This gene may
have a role in fatty acid synthesis in Mycobacterium species or may be involved in
establishment or dissemination in the animal host by destruction of the host cell fatty acids
present in the host cell membrane. The finding that a mutation in this gene
attenuates the virulence of the M. marinum strain suggests that the protein product of this gene
is responsible for the virulence attributes of Mycobacterium species and may contribute
to the establishment of diseases such as leprosy and tuberculosis.

Gene 39.2 

The sequence of the flanking region of M. marinum mutant 39.2 is as follows:

GATCCGCTGGACGGCACCAAAGAATTCATCAAGGGCAGCGATGAGTTCAC
CGTCAACATCGCCCTGGTCGAGAACCAGGAACCCATTCTCGGGGCAATCTA
CGGTCCAGCGAAGCAACTTCTGCACTACGCGGCCAAAGGGGCT (SEQ ID NO: 23)

This can be translated in the +1 reading frame to the following protein sequence:

7 ctggacggcaccaaagaattcatcaagggcagcgatgagttcacc
    LDGTKEFIKGSDEFT
52 gtcaacatcgccctggtcgagaaccaggaacccattctcggggca
    VNIALVENQEPILGA
97 atctacggtccagcgaagcaacttctgcactacgcggccaaaggg
    IYGPAKQLLHYAAKG
142 gct 144 (SEQ ID NO: 43)
    A (SEQ ID NO: 24)



The mutant (39.2), when tested individually in the goldfish model, exhibits
attenuation in virulence as compared to the wild type organism (See Figure 14).

The gene interrupted in the attenuated mutant has been characterized by sequence
analysis, as described above for mutant 41.2. Using the mycobacterium database, a
functional homologue of this gene has been identified in M. tuberculosis
(emb|CAB06277.1|Z83861| hypothetical protein Rv3137). This homologue, in the +1
frame, with an identity of 43% (similarity of 63%), is a probable inositol monophosphate
phosphatase, because it contains an inositol monophosphatase family signature sequence.
It is related to the cysQ proteins identified in the whole database search described below,
which also belong to the inositol monophosphatase family.

Based on a sequence analysis against the entire genomic database, the gene identified
as interrupted in mutant 39.2 is predicted to be a structural protein of an ammonium
transport system (also known as a cysQ gene). This protein affects the pool of
3'-phosphoadenosine-5'-phosphosulfate in the pathway of sulfite synthesis. The homology
is in the +1 reading frame of the translated gene product, with 53-65% identity (63-82%
similarity). The homology noted suggests that mutant 39.2 is attenuated in its ability to
respond to sulfate starvation, as this enzyme is required for growth in defined synthetic
medium with sulfate as a sulfur source. This suggests that in the animal host a sulfur
source is limiting and thus interruption of this gene attenuates growth of the organism in
the animal host. Thus interruption of this gene in a live attenuated Mycobacterium vaccine
strain would be beneficial, as it will limit the ability of the vaccine strain to grow in the
animal host.

Gene 114.7 

The sequence of the flanking region of M. marinum mutant 114.7 is as follows:

AGCCGTATTTCGCCATTGAGAGTTGGGGTCTTGAGATCGGCACTGGAAGGG
GACAGCGTGCTATTGCCTCTTGGTCCGCCCTTGCCACCTGATGCTGTGGCGG
CTAAACGGGGTGAGTCGGGGCTGCTCTGCGGCTTGTCGGTTCCGCTCAGCT
GGGGTACGGCCGTTCCGCCGGATGACTACNACCATTGGGCACCGGAGCCTG
AAGAAGGCGCCGAGGCCGTGGTCGAAGAAAACGTGGATGCGGCAGCTGCC
GGTACCGACGAGTGGGACGAGTGGGCGGAATGGAGGGAGTGGGAGGCAG
CAAATGCCCGAACCTCATTTTCGAGATGCCCCGTACCAGCAGCCGTGATAC
CCGAACTCGCCGGCGGCCGGTTGAGA (SEQ ID NO: 25)

This can be translated in the +1 reading frame to the following protein sequence:

16 ttgagagttggggtcttgagatcggcactggaaggggacagcgtg
    LRVGVLRSALEGDSV
61 ctattgcctcttggtccgcccttgccacctgatgctgtggcggct
    LLPLGPPLPPDAVAA
106 aaacggggtgagtcggggctgctctgcggcttgtcggttccgctc
    KRGESGLLCGLSVPL
151 agctggggtacggccgttccgccggatgactacnaccattgggca
    SWGTAVPPDDYXHWA
196 ccggagcctgaagaaggcgccgaggccgtggtcgaagaaaacgtg
    PEPEEGAEAVVEENV
241 gatgcggcagctgccggtaccgacgagtgggacgagtgggcggaa
    DAAAAGTDEWDEWAE
286 tggagggagtgggaggcagcaaatgcccgaacctcattttcgaga
    WREWEAANARTSFSR
331 tgccccgtaccagcagccgtgatacccgaactcgccggcggccgg
    CPVPAAVIPELAGGR
376 ttgaga 381 (SEQ ID NO: 44)
    LR (SEQ ID NO: 26)

The mutant (114.7), when tested in pools in the goldfish model, appears to exhibit
attenuation in virulence as compared to the wild type organism.

The gene interrupted in the attenuated mutant has been characterized by sequence
analysis. Using either the mycobacterium or the general genomic database, a functional
homologue of this gene has been identified in M. tuberculosis (pir E70662); (Rv2348c).
The homology is in the +1 reading frame, with an identity of 82% (similarity 84%), to a
hypothetical protein of M. tuberculosis. This protein is of unknown function as it has no
known homology to any other sequence in the database. Extrapolating from the animal
model, it appears that this gene is a virulence gene in M. marinum and M. tuberculosis.

Mutant 32.2 

The sequence of the flanking region of M. marinum mutant 32.2 is as follows:

TCCANNCAGAGGNGCACGTAGANCGTAGGACGGAANGCGGNGNGATCGNC
AATACGGCTGGCNCTGCNAGAACTGNTCGAGGGCCTGCNGCTGGGGCC
(SEQ ID NO: 27)

This can be translated in the -2 reading frame to the following protein sequence:

APAAGPRXVLAXPAVLXIXPXSVLRSTCXSXW (SEQ ID NO: 28)

The mutant 32.2, when tested individually in the goldfish model, exhibits
attenuated virulence (reduced Competitive Index, see Figure 12) as compared to the wild
type organism.

The gene interrupted in the attenuated mutant has been characterized by sequence
analysis. Using the Mycobacterium database, a functional homologue of this gene has
been identified in M. tuberculosis (emb CAB06230 (Z83864) (Rv3860)). This is a gene
encoding a hypothetical protein of unknown function with homology to other
Mycobacterium proteins also of unknown function including [emb CAB08086 (Z94121)
(Rv3888c); emb CAA75199 (Y14967); emb CAA17968 (AL022120) (Rv3876); emb
CAB08981 (Z95558) (Rv0530) and emb CAA15582 (AL008967) (Rv2787)]. The finding
that a mutation in this gene attenuates the virulence of the M. marinum strain
suggests that the protein product of this gene contributes to the disease process in
tuberculosis and leprosy. The interruption of this gene in a live attenuated Mycobacterium
vaccine strain would be beneficial, as it will limit the ability of the vaccine strain to grow
in the animal host.

The homology with the M. tuberculosis homologue is 64% identity, 78%
similarity.
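
For orientation, percent identity of the kind quoted throughout this Example is conventionally computed as the number of identical aligned positions divided by the alignment length. The following minimal sketch illustrates that calculation on a toy alignment. The two aligned strings are hypothetical and are not taken from the disclosure, and the sketch does not compute "similarity", which additionally requires a substitution matrix to decide which non-identical residue pairs count as similar.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the full length of a pairwise alignment.

    Both strings must already be aligned (equal length, '-' for gaps).
    A gap never counts as a match.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    matches = sum(1 for x, y in zip(aligned_a, aligned_b)
                  if x == y and x != "-")
    return 100.0 * matches / len(aligned_a)


# Hypothetical toy alignment, for illustration only.
query   = "MKT-LVAGQG"
subject = "MRTALV-GEG"

print(f"{percent_identity(query, subject):.0f}% identity")
```

In this toy alignment, 6 of the 10 aligned positions are identical.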

Mutant 42.2 

The sequence of the flanking region of M. marinum mutant 42.2 is as follows:

TTTGCAATCCACCTGTACGCGGAACW
ATAAGCTAGCT (SEQ ID NO: 29)

This can be translated in the -1 reading frame to the following protein sequence:

S*LIRQGKTXXXSSAYRWIA (SEQ ID NO: 30)

The mutant 42.2, when tested individually in the goldfish model, exhibits
attenuated virulence (reduced Competitive Index, see Figure 12, and decreased virulence
in the LD50 experiment, Figure 15) as compared to the wild type organism.

The gene interrupted in the attenuated mutant has been characterized by sequence
analysis. Using the Mycobacterium database, a functional homologue of this gene has
been identified in M. tuberculosis (emb CAB03756 (Z81371) (mbtB)). This is a gene
involved in mycobactin biosynthesis. M. tuberculosis produces both cell-associated
mycobactins and secreted, water-soluble mycobactins. Both types are siderophores and
act to scavenge iron from the environment to support growth of the organism. The genes
involved in mycobactin synthesis are contained in an operon. The finding that
a mutation in this gene attenuates the virulence of the M. marinum strain suggests that iron is
required for Mycobacterium growth in the animal host. The interruption of this gene in
a live attenuated Mycobacterium vaccine strain would be beneficial, as it will limit the
ability of the vaccine strain to grow in the animal host.

The homology with the M. tuberculosis homologue is 62% identity, 99%
similarity.

Mutant 60.2 

The sequence of the flanking region of M. marinum mutant 60.2 is as follows:

CCANACCTATCTGTTTNCAGNTTNAGACNACGGNATCTCACGCGNTTGGGC
CCNGCCACCAAACGCCGCGTNGA (SEQ ID NO: 31)

This can be translated in six reading frames to the following protein sequences:

DNA: CCANACCTATCTGTTTNCAGNTTNAGACNACGGNATCTCACGCGNTTGGGC
+3: XPICXQXXTTXSHAXGP
+2: XTYLFXXXDXGISRXWA
+1: PXLSVXXXRXRXLTRLG

DNA: CCNGCCACCAAACGCCGCGTNGA (SEQ ID NO: 31)
+3: XHQTPRX (SEQ ID NO: 32)
+2: XPPNAAX (SEQ ID NO: 33)
+1: PATKRRV (SEQ ID NO: 34)

>60.2/T89 T87 removed

DNA: TCNACGCGGCGTTTGGTGGCNGGGCCCAANCGCGTGAGATNCCGTNGTCTN
-1: STRRLVAGPXRVRXRXL
-2: XRGVWWXGPXA*DXVVX
-3: XAAFGGXAQXREXPXSX

DNA: AANCTGNAAACAGATAGGTNTGG (SEQ ID NO: 35)
-1: XLXTDRX (SEQ ID NO: 36)
-2: XXKQIGX (SEQ ID NO: 37)
-3: XXNR*VW (SEQ ID NO: 38)

The mutant 60.2, when tested individually in the goldfish model, exhibits
attenuated virulence (reduced Competitive Index, see Figure 12) as compared to the wild
type organism.

The gene interrupted in the attenuated mutant has been characterized by sequence
analysis. Using the Mycobacterium database, functional homologues of this gene have
been identified in M. tuberculosis [emb CAA17485 (AL021957) (Rv2181); emb
CAB06507 (Z84498) (Rv1954c); emb CAA17586 (AL021999) (Rv0987); emb
CAB07087 (Z92771) (Rv3268); emb CAB08632 (Z95387) (Rv2610c)]. This is a gene
encoding a hypothetical integral membrane protein of unknown function. The finding
that a mutation in this gene attenuates the virulence of the M. marinum strain
suggests that it is required for Mycobacterium growth in the animal host. The interruption
of this gene in a live attenuated Mycobacterium vaccine strain would be beneficial, as it
will limit the ability of the vaccine strain to grow in the animal host.

The homology with the M. tuberculosis homologue Rv2181 is 58% identity, 66%
similarity; overall homology with all the genes identified is 58-77% identity, 66-88%
similarity.

Gene 68.6 

The sequence of the flanking region of M. marinum mutant 68.6 is as follows:

AAATCATCATCTATCGTTACCCGGGGCAAGCCAAGCACCTCAGCAAAAATT
CTGCAGAGCATTTGCTCTTGCGGAGTTCGCGGCATACGGCCAATCGCCGCA
TGATGATCGGGCACAGGCAGCGCTTTACGATCCACCTTCTTATTCGGAGTT
AACGGCATGGTCTCAAGTCTTACGATGACAGACGGCACCATATATTCGGCC
AGTTTCAGGGAGGCGTAGCGCCGCAGTTCTGCTGTATCTATCA
(SEQ ID NO: 39)

This can be translated in the -3 reading frame to the following protein sequence:

1 IDTAELRRYA SLKLAEYMVP SVIVRLETMP LTPNKKVDRK ALPVPDHHAA
51 IGRMPRTPQE EMLCRIFAEV LGLPRVTIDD D (SEQ ID NO: 40)


The mutant (68.6), when tested individually in the goldfish model, exhibits
attenuated virulence (reduced Competitive Index) as compared to the wild type organism
(Figure 12).

The gene interrupted in the attenuated mutant has been characterized by sequence
analysis. Using the mycobacterium database, a functional homologue of this gene has
been identified in M. tuberculosis (pir E70751 emb CAA98937 Z74410); (nrp protein).
The homology is in the -3 reading frame, with an identity of 43% (similarity 62%), to a
probable nrp protein of M. tuberculosis. This protein belongs to a superfamily of acetate
CoA ligase proteins involved in peptide synthesis. A second protein of M. tuberculosis
also shows significant homology. This protein is the mbtE protein (pir C70588 emb
CAB08481 Z95208). The homology is again in the -3 reading frame, with an identity of
38% (similarity 56%). This is a gene involved in mycobactin biosynthesis. M.
tuberculosis produces both cell-associated mycobactins and secreted, water-soluble
mycobactins. Both types are siderophores and act to scavenge iron from the environment
to support growth of the organism. The genes involved in mycobactin synthesis are
contained in an operon.

Searching against the entire database, we have identified significant homologues
in Bacillus subtilis. The gene homologue is dhbF, a gene involved in
2,3-dihydroxybenzoate biosynthesis. The gene has been identified as essential for the
synthesis of a siderophore in B. subtilis.

Mutant 95.3 

The sequence of the flanking region of M. marinum mutant 95.3 is as follows:

GATTAGCTTATTCCTCAAGGCACGAGCGATTAGCTTATTCCTCAAGGCACG
AGCGACTAGCTTATTCCTCAAGGCACGAGCTTCGCACTTGACGGTGTAGAG
CTCAATAGCTTATTCCTCAAGGCACGAGCTCGACTTCGCACTTGACGGTGT
AGAGCTCAAAG (SEQ ID NO: 41)

This can be translated in the +1 reading frame to the following protein sequence:

1 D*LIPQGTSD*LIPQGTSD*LIPQGTSFALDGVELNSLFLKARARLRT*R
51 CRAQ (SEQ ID NO: 42)



The gene interrupted in the attenuated mutant has been characterized by sequence
analysis. Using the Mycobacterium database, functional homologues of this gene have
been identified in M. tuberculosis [pir B70963 emb CAB0717 (Z92669) (Rv0236c); pir
B70748 emb CAA98982 (Z74697) smc protein]. This is a gene encoding a hypothetical
integral membrane protein of unknown function. The finding that a mutation
in this gene attenuates the virulence of the M. marinum strain suggests that it is required for
Mycobacterium growth in the animal host. The interruption of this gene in a live
attenuated Mycobacterium vaccine strain would be beneficial, as it will limit the ability
of the vaccine strain to grow in the animal host.

The homology with the M. tuberculosis homologue Rv0236c is 36% identity, 64%
similarity, and with the smc protein is 47% identity, 61% similarity.

From the foregoing description, one skilled in the art can easily ascertain the
essential characteristics of this invention, and without departing from the spirit and scope
thereof, can make changes and modifications of the invention to adapt it to various usages
and conditions.

Without further elaboration, it is believed that one skilled in the art can, using the
preceding description, utilize the present invention to its fullest extent. The preceding
preferred specific embodiments are, therefore, to be construed as merely illustrative, and
not limitative of the remainder of the disclosure in any way whatsoever.

The entire disclosure of all applications, patents and publications cited above and
in the figures is hereby incorporated by reference.

We claim:

1. A method for identifying a virulence gene of M. marinum, comprising 

a) mutagenizing an M. marinum bacterium by introducing into the bacterium 
a plasmid which comprises a signature-tagged transposon, whereby the transposon 
integrates into and disrupts a gene in the bacterium, 

b) introducing the mutagenized bacterium into a host susceptible to infection 
thereof, 

c) identifying a bacterium which comprises a signature-tagged transposon and 
which exhibits reduced viability in the host, compared to a non-mutagenized M. marinum 
bacterium, 

d) cloning and/or sequencing a nucleic acid sequence which flanks the 
integrated transposon in said identified bacterium, and 

e) identifying a wild type M. marinum gene which comprises at least a portion 
of said flanking sequence. 
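
Step c) can be pictured as a set comparison: each mutant in the inoculated pool carries a distinct tag, and mutants whose tags are present in the input pool but are not detected among bacteria recovered from the host are candidates for reduced viability. The following Python sketch illustrates that logic only. The function name and the tag identifiers are hypothetical, and in practice tag detection is an experimental readout rather than a comparison of strings.

```python
def attenuated_candidates(input_pool_tags, recovered_tags):
    """Return tags present in the inoculum but absent from the recovered pool."""
    return sorted(set(input_pool_tags) - set(recovered_tags))


# Hypothetical tag identifiers for one pool of mutants.
input_pool = ["tag01", "tag02", "tag03", "tag04"]
recovered_from_host = ["tag01", "tag03"]

print(attenuated_candidates(input_pool, recovered_from_host))
# prints: ['tag02', 'tag04']
```

Mutants flagged this way would still need the confirmation recited in claim 2, since a tag can also be lost for reasons unrelated to virulence.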

2. The method of claim 1, further comprising 

f) confirming that the mutation renders M. marinum less virulent. 

3. A method of constructing an avirulent M. marinum bacterium, comprising 
mutagenizing an M. marinum virulence gene identified by the method of claim 1. 

4. An avirulent M. marinum bacterium, produced by the method of claim 3. 

5. An avirulent M. marinum bacterium, in which one or more genes comprising 
a nucleic acid of SEQ ID NOs: 4, 6, 8, 10, 11, 13, 21, 23, 25, 27, 29, 31, 35, 39, 41, 43 or 
44 is mutated. 

6. A method for identifying a virulence gene of M. tuberculosis, comprising 
identifying a virulence gene of M. marinum bacterium according to the method of claim 
1, and further comprising, 

comparing said flanking nucleic acid sequence to a databank of M. 
tuberculosis nucleic acid sequences, and/or comparing the sequences of peptides which 
are coded for by said flanking sequences to a known M. tuberculosis protein database, and 

identifying an M. tuberculosis gene which comprises a sequence that is 
substantially identical to said flanking sequences. 

7. A method for generating an avirulent M. tuberculosis bacterium, comprising 
mutagenizing an M. tuberculosis virulence gene identified by the method of claim 6. 

8. An avirulent M. tuberculosis bacterium, produced by the method of claim 7. 

9. An avirulent M. tuberculosis bacterium, in which one or more of genes 
Rv0822c, CY20G9.23 (Rv0497), the pks family, including e.g., ppsE (Rv2935), pks6 
(Rv0405), pks9 (Rv1664), pks8 (Rv1662), pks1 (Rv2946c), and pks002c, Rv3511, 
008381 (Rv0357c), Rv3775, Rv3137, Rv2348c, Rv3860, mbtB (Rv2383c), Rv2181, 
Rv1954c, Rv0987, Rv3268, Rv2610c, nrp (pir E70751, Rv0101), mbtE (Rv2380c), 
Rv0236c or smc (Rv2922c) is mutated to render the M. tuberculosis bacterium less 
virulent. 

10. An avirulent M. tuberculosis bacterium of claim 9, in which gene Rv0822c 
is mutated. 

11. An avirulent M. tuberculosis bacterium of claim 9, in which gene CY20G9.23 
is mutated. 

12. An avirulent M. tuberculosis bacterium of claim 9, in which gene ppsE is 
mutated. 

13. An avirulent M. tuberculosis bacterium of claim 9, in which gene pks6 is 
mutated. 

14. An avirulent M. tuberculosis bacterium of claim 9, in which gene pks9 is 
mutated. 

15. An avirulent M. tuberculosis bacterium of claim 9, in which gene pks8 is 
mutated. 

16. An avirulent M. tuberculosis bacterium of claim 9, in which gene pks1 is 
mutated. 

17. An avirulent M. tuberculosis bacterium of claim 9, in which gene pks002c is 
mutated. 

18. An avirulent M. tuberculosis bacterium of claim 9, in which gene Rv3511 is 
mutated. 

19. An avirulent M. tuberculosis bacterium of claim 9, in which gene 008381 is 
mutated. 

20. An avirulent M. tuberculosis bacterium of claim 9, in which gene Rv3775 is 
mutated. 

21. An avirulent M. tuberculosis bacterium of claim 9, in which gene Rv3137 is 
mutated. 

22. An avirulent M. tuberculosis bacterium of claim 9, in which gene Rv2348c 
is mutated. 

23. An avirulent M. tuberculosis bacterium of claim 9, in which gene Rv3860 is 
mutated. 

24. An avirulent M. tuberculosis bacterium of claim 9, in which gene mbtB is 
mutated. 

25. An avirulent M. tuberculosis bacterium of claim 9, in which gene Rv2181, 
Rv1954c, Rv0987, Rv3268, or Rv2610c is mutated. 

26. An avirulent M. tuberculosis bacterium of claim 9, in which gene nrp (pir 
E70751) or mbtE is mutated. 

27. An avirulent M. tuberculosis bacterium of claim 9, in which gene Rv0236c 
or smc is mutated. 

28. An isolated nucleic acid comprising the oligonucleotide of SEQ ID NO: 4, 
or a variant or fragment thereof. 

29. An isolated nucleic acid comprising the oligonucleotide of SEQ ID NO: 6, 
or a variant or fragment thereof. 

30. An isolated nucleic acid comprising the oligonucleotide of SEQ ID NO: 8, 
or a variant or fragment thereof. 

31. An isolated nucleic acid comprising the oligonucleotide of SEQ ID NO: 11, 
or a variant or fragment thereof. 

32. An isolated nucleic acid comprising the oligonucleotide of SEQ ID NO: 13, 
or a variant or fragment thereof. 

33. An isolated nucleic acid comprising the oligonucleotide of SEQ ID NO: 21, 
or a variant or fragment thereof. 

34. An isolated nucleic acid comprising the oligonucleotide of SEQ ID NO: 23, 
or a variant or fragment thereof. 

35. An isolated nucleic acid comprising the oligonucleotide of SEQ ID NO: 25, 
or a variant or fragment thereof. 

36. An isolated nucleic acid comprising the oligonucleotide of SEQ ID NO: 27, 
or a variant or fragment thereof. 

37. An isolated nucleic acid comprising the oligonucleotide of SEQ ID NO: 29, 
or a variant or fragment thereof. 

38. An isolated nucleic acid comprising the oligonucleotide of SEQ ID NO: 31, 
or a variant or fragment thereof. 

39. An isolated nucleic acid comprising the oligonucleotide of SEQ ID NO: 39, 
or a variant or fragment thereof. 

40. An isolated nucleic acid comprising the oligonucleotide of SEQ ID NO: 41, 
or a variant or fragment thereof. 
41. A pharmaceutical composition, comprising an avirulent M. marinum 
bacterium of claim 5 and a pharmaceutically acceptable carrier. 

42. An attenuated M. marinum vaccine, comprising an avirulent M. marinum 
bacterium of claim 5 and a pharmaceutically acceptable carrier. 

43. A pharmaceutical composition, comprising an avirulent M. tuberculosis 
bacterium of claim 9 and a pharmaceutically acceptable carrier. 

44. An attenuated M. tuberculosis vaccine, comprising an avirulent M. 
tuberculosis bacterium of claim 9 and a pharmaceutically acceptable carrier. 

45. An attenuated M. tuberculosis vaccine, comprising an avirulent M. 
tuberculosis bacterium which comprises one or more mutations in one or more virulence 
genes identified by the method of claim 7 and a pharmaceutically acceptable carrier. 

46. A method to elicit an immune response in a fish in need of such treatment, 
comprising administering to said fish an avirulent M. marinum bacterium of claim 5. 

47. A method to elicit an immune response in a patient in need of such treatment, 
comprising administering to said patient an avirulent M. tuberculosis bacterium of claim 
9. 

48. An isolated polyketide made by the polyketide synthase encoded by the M. 
marinum polyketide synthase gene which comprises the oligonucleotide of SEQ ID NO: 8. 

49. An isolated polyketide made by the M. tuberculosis polyketide synthase gene 
ppsE, pks6, pks8, pks9, pks1 or pks002c. 

50. A method for isolating a mutagenized M. marinum bacterium which exhibits 
reduced virulence in a host susceptible to infection thereof compared to a non-mutagenized 
M. marinum bacterium, comprising integrating a tagged transposon into the DNA of an M. 
marinum bacterium in a manner effective to produce reduced virulence, and isolating said 
mutagenized bacterium. 

51. An isolated nucleic acid of claim 28, consisting essentially of the 
oligonucleotide of SEQ ID NO: 4, or a variant or fragment thereof. 

52. An isolated nucleic acid of claim 29, consisting essentially of the 
oligonucleotide of SEQ ID NO: 6, or a variant or fragment thereof. 

53. An isolated nucleic acid of claim 30, consisting essentially of the 
oligonucleotide of SEQ ID NO: 8, or a variant or fragment thereof. 

54. An isolated nucleic acid of claim 31, consisting essentially of the 
oligonucleotide of SEQ ID NO: 11, or a variant or fragment thereof. 

55. An isolated nucleic acid of claim 32, consisting essentially of the 
oligonucleotide of SEQ ID NO: 13, or a variant or fragment thereof. 

56. An isolated nucleic acid of claim 33, consisting essentially of the 
oligonucleotide of SEQ ID NO: 21, or a variant or fragment thereof. 

57. An isolated nucleic acid of claim 34, consisting essentially of the 
oligonucleotide of SEQ ID NO: 23, or a variant or fragment thereof. 

58. An isolated nucleic acid of claim 35, consisting essentially of the 
oligonucleotide of SEQ ID NO: 25, or a variant or fragment thereof. 

59. An isolated nucleic acid of claim 36, consisting essentially of the 
oligonucleotide of SEQ ID NO: 27, or a variant or fragment thereof. 

60. An isolated nucleic acid of claim 37, consisting essentially of the 
oligonucleotide of SEQ ID NO: 29, or a variant or fragment thereof. 

61. An isolated nucleic acid of claim 38, consisting essentially of the 
oligonucleotide of SEQ ID NO: 31, or a variant or fragment thereof. 

62. An isolated nucleic acid of claim 39, consisting essentially of the 
oligonucleotide of SEQ ID NO: 39, or a variant or fragment thereof. 

63. An isolated nucleic acid of claim 40, consisting essentially of the 
oligonucleotide of SEQ ID NO: 41, or a variant or fragment thereof. 

64. An isolated nucleic acid which is complementary to, or which can hybridize 
under high stringency conditions to, at least a portion of the isolated nucleic acid, or a 
variant thereof, of claim 51. 



WO 01/19993 PCT/US00/25512 

65. An isolated nucleic acid which is complementary to, or which can hybridize 
under high stringency conditions to, at least a portion of the isolated nucleic acid, or a 
variant thereof, of claim 52. 

66. An isolated nucleic acid which is complementary to, or which can hybridize 
under high stringency conditions to, at least a portion of the isolated nucleic acid, or a 
variant thereof, of claim 53. 

67. An isolated nucleic acid which is complementary to, or which can hybridize 
under high stringency conditions to, at least a portion of the isolated nucleic acid, or a 
variant thereof, of claim 54. 

68. An isolated nucleic acid which is complementary to, or which can hybridize 
under high stringency conditions to, at least a portion of the isolated nucleic acid, or a 
variant thereof, of claim 55. 

69. An isolated nucleic acid which is complementary to, or which can hybridize 
under high stringency conditions to, at least a portion of the isolated nucleic acid, or a 
variant thereof, of claim 56. 

70. An isolated nucleic acid which is complementary to, or which can hybridize 
under high stringency conditions to, at least a portion of the isolated nucleic acid, or a 
variant thereof, of claim 57. 

71. An isolated nucleic acid which is complementary to, or which can hybridize 
under high stringency conditions to, at least a portion of the isolated nucleic acid, or a 
variant thereof, of claim 58. 

72. An isolated nucleic acid which is complementary to, or which can hybridize 
under high stringency conditions to, at least a portion of the isolated nucleic acid, or a 
variant thereof, of claim 59. 

73. An isolated nucleic acid which is complementary to, or which can hybridize 
under high stringency conditions to, at least a portion of the isolated nucleic acid, or a 
variant thereof, of claim 60. 

74. An isolated nucleic acid which is complementary to, or which can hybridize 
under high stringency conditions to, at least a portion of the isolated nucleic acid, or a 
variant thereof, of claim 61. 

75. An isolated nucleic acid which is complementary to, or which can hybridize 
under high stringency conditions to, at least a portion of the isolated nucleic acid, or a 
variant thereof, of claim 62. 

76. An isolated nucleic acid which is complementary to, or which can hybridize 
under high stringency conditions to, at least a portion of the isolated nucleic acid, or a 
variant thereof, of claim 63. 
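
The complementarity recited in claims 64 to 76 can be illustrated with a short Python sketch that computes the reverse complement of a nucleotide string. The function name is an assumption made for illustration; the sketch treats "n" as unknown and leaves it unchanged, and it does not model hybridization stringency in any way.

```python
# Translation table for the four bases plus 'n' (unknown), lower and upper case.
COMPLEMENT = str.maketrans("acgtnACGTN", "tgcanTGCAN")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement; whitespace is ignored and 'n' stays 'n'."""
    return "".join(seq.split()).translate(COMPLEMENT)[::-1]


# First ten nucleotides of SEQ ID NO: 13 in the sequence listing.
print(reverse_complement("ggtcgaagac"))
# prints: gtcttcgacc
```

The printed string matches the last ten nucleotides of SEQ ID NO: 17, which the listing presents as the opposite strand of SEQ ID NO: 13.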

SEQUENCE LISTING 

<110> UNIVERSITY OF MARYLAND 
      UNITED STATES GOVERNMENT, as represented by 
      Department of Veterans Affairs 
      TRUCKSIS, Michele 

<120> VIRULENCE GENES OF M. MARINUM AND M. TUBERCULOSIS 
<130> VET 1 WO 

<140> 
<141> 

<160> 46 

<170> PatentIn Ver. 2.1 

<210> 1 
<211> 18 
<212> DNA 
<213> Artificial Sequence 
<220> 
<223> Description of Artificial Sequence: Primer 
<400> 1 

ctaggtacct acaacctc 

<210> 2 
<211> 18 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Primer 
<400> 2 

catggtaccc attctaac 



<210> 3 
<211> 89 
<212> DNA 
<213> Artificial Sequence 
<220> 
<223> Description of Artificial Sequence: template RT1 
      oligonucleotide 
<220> 
<223> "n" represents a, t, c, g, other or unknown 
<400> 3 

ctaggtacct acaacctcaa gcttnknknk nknknknknk nknknknknk nknknknknk 60 
nknkaagctt ggttagaatg ggtaccatg 89 

<210> 4 
<211> 169 
<212> DNA 
<213> Mycobacterium marinum 
<220> 
<223> Mutant 41.2 
<220> 
<223> "n" represents a, t, c, g, other or unknown 
<400> 4 

cgggccgatc tatgacgagn acgacgggac agatgggtcc ccggatggtc tacaccgaga 60 
ccaaactgaa cccgtcgttc tccttcggcg ggcccaagtg tctggtgaag gtgatccaaa 120 
aactgtccgg gttgagcacc aaccggttca tcgccatcga cttcgtcgg 169 



<210> 5 
<211> 55 
<212> PRT 

<213> Mycobacterium marinum 
<220> 

<223> Mutant 41.2 
<220> 

<223> "Xaa" represents any, other or unknown amino acid 
<400> 5 

Gly Arg Ser Met Thr Xaa Thr Thr Gly Gln Met Gly Pro Arg Met Val 
1 5 10 15 

Tyr Thr Glu Thr Lys Leu Asn Ser Ser Phe Ser Phe Gly Gly Pro Lys 
20 25 30 

Cys Leu Val Lys Val Ile Gln Lys Leu Ser Gly Leu Ser Ile Asn Arg 
35 40 45 

Phe Ile Ala Ile Asp Phe Val 
50 55 



<210> 6 
<211> 382 
<212> DNA 
<213> Mycobacterium marinum 
<220> 
<223> Mutant 80.1 
<220> 
<223> "n" represents a, t, c, g, other or unknown 
<400> 6 

acctcctgaa tgtgtgacat ggccctagaa ccctgcntta gactatttac atacatggct 60 
tcacccggcc gcctgtgcca ctcataagac taccggaacg gaccaacaat cgcacagtca 120 
tctgaagcag gagtctgtca accacaggcc ctgaaggaac agtgactgcg cagagaaaga 180 
cggcaatgca tcctgttaac taagtggctg gaggagtgcc aggtcattcc aaagaacatc 240 
cctgaaatct ggaggagaag gtatagtgag caccccaaaa tttcaactgg agacatcana 300 
ccagagtctc cactgagctg ccaagcttgc ggccgcactc gagtaactag ttaacccctt 360 
ggggcctcta aacgggtctt ga 382 



<210> 7 
<2U> 121 
<212> PRT 

<213> Mycobacterium marinum 
c220> 

<223> Mutant 80.1 
<220> 

c223> "Xaa" represents any, other or unknown amino acid 
<400> 7 

Pro Pro Glu Cys Val Thr Trp Pro Asn Pro Ala Leu Asp Tyr Leu His 
1 5 10 15 

Thr Trp Leu His Pro Ala Ala Cys Ala Thr His Lys Thr Thr Gly Met 
20 25 30 

Asp Gin Gin Ser His Ser His Leu Lys Gin Glu Ser Val Asn His Arg 
35 40 45 

Pro Arg Asn Ser Asp eye Ala Glu Lys Aep Gly Aen Ala Ser Cya Leu 
50 55 60 

Ser Gly Trp Arg Ser Ala Arg Ser Phe Gin Arg Thr Ser Leu Lys Ser 
65 70 75 B0 

Gly Gly Glu Gly lie Val Ser Thr Pro Lys Phe Gin Leu Glu Thr Ser 
85 90 95 

Xaa Gin Ser Leu Tyr Ala Ala Lys Leu Ala Ala Ala Leu Glu Leu Val 
100 105 110 

Asn Pro Leu Gly Pro Leu Asn Gly Ser 
US 120 



<210> 8 
<211> 172 
<212> DNA 
<213> Mycobacterium marinum 
<220> 
<223> Mutant 66.1 
<400> 8 

tcatcgctaa ccggttgagc taccgcccgc acagcgtgcc catcatctcc aacctgaccg 60 
gctcacttgc cacagtcgag caactcacat cgccccgcta ttgggcacag catgtacggg 120 
agccagtgcg gtttcatgac ggcgttaccg gcttgttggc aggcggagaa ca 172 



<210> 9 
<211> 55 
<212> PRT 
<213> Mycobacterium marinum 
<220> 
<223> Mutant 86.1 
<400> 9 

Ala Asn Arg Leu Ser Tyr Arg Pro His Ser Val Pro Ile Ile Ser Asn 
1 5 10 15 

Leu Thr Gly Ser Leu Ala Thr Val Glu Gln Leu Thr Ser Pro Arg Tyr 
20 25 30 

Trp Ala Gln His Val Arg Glu Pro Val Arg Phe His Asp Gly Val Thr 
35 40 45 

Gly Leu Leu Ala Gly Gly Glu 
50 55 



<210> 10 
<211> 228 
<212> DNA 
<213> Mycobacterium marinum 
<220> 
<223> Mutant 62.2 
<400> 10 

gatccggtgc cgccttgacc ggccgcgcca ccagtaccgc cgacgccgcc ctggccgccg 60 
gcttgtgcgg cttgcgatgg gtcggtgctg tcggtgccgg tgcctccggt gccgccttgg 120 
cctccggttc cgccggtgcc gccctggccg ccggcgcctt ggatgccgcc ggtgccggct 180 
ccggctgcac cgcccgttcc gccggttccg cctgcgccgc cggtgcct 228 



<210> 11 
<211> 225 
<212> DNA 
<213> Mycobacterium marinum 
<220> 
<223> Mutant 62.2 
<220> 
<221> CDS 
<222> (1)..(225) 
<400> 11 

ggc acc ggc ggc gca ggc gga acc ggc gga acg ggc ggt gca gcc gga 48 
Gly Thr Gly Gly Ala Gly Gly Thr Gly Gly Thr Gly Gly Ala Ala Gly 
1 5 10 15 

acc ggc acc ggc ggc atc caa ggc gcc ggc ggc cag ggc ggc acc ggc 96 
Thr Gly Thr Gly Gly Ile Gln Gly Ala Gly Gly Gln Gly Gly Thr Gly 
20 25 30 

gga acc gga ggc caa ggc ggc acc gga ggc acc ggc acc gac agc acc 144 
Gly Thr Gly Gly Gln Gly Gly Thr Gly Gly Thr Gly Thr Asp Ser Thr 
35 40 45 

gac cca tcg caa gcc gca caa gcc ggc ggc cag ggc ggc gcc ggc ggt 192 
Asp Pro Ser Gln Ala Ala Gln Ala Gly Gly Gln Gly Gly Val Gly Gly 
50 55 60 

act ggt ggc gcg gcc ggt caa ggc ggc acc gga 225 
Thr Gly Gly Ala Ala Gly Gln Gly Gly Thr Gly 
65 70 75 



<210> 12 
<211> 75 
<212> PRT 

<213> Mycobacterium marinum 
<220> 

<223> Mutant 62.2 
<400> 12 

Gly Thr Gly Gly Ala Gly Gly Thr Gly Gly Thr Gly Gly Ala Ala Gly 
1 5 10 15 

Thr Gly Thr Gly Gly Ile Gln Gly Ala Gly Gly Gln Gly Gly Thr Gly 
20 25 30 

Gly Thr Gly Gly Gln Gly Gly Thr Gly Gly Thr Gly Thr Asp Ser Thr 
35 40 45 

Asp Pro Ser Gln Ala Ala Gln Ala Gly Gly Gln Gly Gly Val Gly Gly 
50 55 60 

Thr Gly Gly Ala Ala Gly Gln Gly Gly Thr Gly 
65 70 75 



<210> 13 
<211> 285 
<212> DNA 
<213> Mycobacterium marinum 
<220> 
<223> Mutant 67.1 
<400> 13 

ggtcgaagac tatcggtatg ctccatagcg ttccgtcggg aagctgcatg ttgtcaaggg 60 
tttcgtcgac ctctcggcga cccatgaatc ccgatagtgg cgtgaagaaa ccgtacgaga 120 
tgctgatcac ctcgtgggcg gtcgccttcg atatcgggat gcgcaccaat ccctcaatcc 180 
ggccggccac gttttccctt tccaccctgt cgacgagtgg gtgtccgtta tggcctaaat 240 
aatccatctt gctgcctctt tctgaaatcg aatttattac tatcg 285 



<210> 14 
<211> 93 
<212> PRT 

<213> Mycobacterium marinum 
<220> 
<223> Mutant 67.1 
<400> 14 

Ser Lys Thr Ile Gly Met Leu His Ser Val Pro Ser Gly Ser Cys Met 
1 5 10 15 

Leu Ser Arg Val Ser Ser Thr Ser Arg Arg Pro Met Asn Pro Asp Ser 
20 25 30 

Gly Val Lys Lys Pro Tyr Glu Met Leu Ile Thr Ser Trp Ala Val Ala 
35 40 45 

Phe Asp Ile Gly Met Arg Thr Asn Pro Ser Ile Arg Pro Ala Thr Phe 
50 55 60 

Ser Leu Ser Thr Leu Ser Thr Ser Gly Cys Pro Leu Trp Pro Lys Ser 
65 70 75 80 

Ile Leu Leu Pro Leu Ser Glu Ile Glu Phe Ile Thr Ile 
85 90 

<210> 15 
<211> 90 
<212> PRT 

<213> Mycobacterium marinum 
<220> 
<223> Mutant 67.1 
<400> 15 

Val Glu Asp Tyr Arg Tyr Ala Pro Arg Ser Val Gly Lys Leu His Val 
1 5 10 15 

Val Lys Gly Phe Val Asp Leu Ser Ala Thr His Glu Ser Arg Trp Arg 
20 25 30 

Glu Glu Thr Val Arg Asp Ala Asp His Leu Val Gly Gly Arg Leu Arg 
35 40 45 

Tyr Arg Asp Ala His Gln Ser Leu Asn Pro Ala Gly His Val Phe Pro 
50 55 60 

Phe His Pro Val Asp Glu Trp Val Ser Val Met Ala Ile Ile His Leu 
65 70 75 80 

Ala Ala Ser Phe Asn Arg Ile Tyr Tyr Tyr 
85 90 

<210> 16 
<211> 92 
<212> PRT 

<213> Mycobacterium marinum 
<220> 

<223> 67.1 
<400> 16 

Gly Arg Arg Leu Ser Val Cys Ser Ile Ala Phe Arg Arg Glu Ala Ala 
1 5 10 15 

Cys Cys Gln Gly Phe Arg Arg Pro Leu Gly Asp Pro Ile Pro Ile Val 
20 25 30 

Ala Arg Asn Arg Thr Arg Cys Ser Pro Arg Gly Arg Ser Pro Ser Ile 
35 40 45 

Ser Gly Cys Ala Pro Ile Pro Gln Ser Gly Arg Pro Arg Phe Pro Phe 
50 55 60 

Pro Pro Cys Arg Arg Val Gly Val Arg Tyr Gly Leu Asn Asn Pro Ser 
65 70 75 80 

Cys Cys Leu Phe Leu Lys Ser Asn Leu Leu Leu Ser 
85 90 

<210> 17 
<211> 285 
<212> DNA 
<213> Mycobacterium marinum 
<220> 
<223> Mutant 67.1 
<400> 17 

cgatagtaat aaattcgatt tcagaaagag gcagcaagat ggattattta ggccataacg 60 
gacacccact cgtcgacagg gtggaaaggg aaaacgtggc cggccggatt gagggattgg 120 
tgcgcatccc gatatcgaag gcgaccgccc acgaggtgat cagcatctcg tacggtttct 180 
tcacgccact atcgggattc atgggtcgcc gagaggtcga cgaaaccctt gacaacatgc 240 
agcttcccga cggaacgcta tggagcatac cgatagtctt cgacc 285 



<210> 18 
<211> 89 
<212> PRT 
<213> Mycobacterium marinum 
<220> 
<223> Mutant 67.1 
<400> 18 

Arg Ile Arg Phe Gln Lys Glu Ala Ala Arg Trp Ile Ile Ala Ile Thr 
1 5 10 15 

Asp Thr His Ser Ser Thr Gly Trp Lys Gly Lys Thr Trp Pro Ala Gly 
20 25 30 

Leu Arg Asp Trp Cys Ala Ser Arg Tyr Arg Arg Arg Pro Pro Thr Arg 
35 40 45 

Ser Ala Ser Arg Thr Val Ser Ser Arg His Tyr Arg Asp Ser Trp Val 
50 55 60 

Ala Glu Arg Ser Thr Lys Pro Leu Thr Thr Cys Ser Phe Pro Thr Glu 
65 70 75 80 

Arg Tyr Gly Ala Tyr Arg Ser Ser Thr 
85 

<210> 19 
<211> 91 
<212> PRT 

<213> Mycobacterium marinum 
<220> 
<223> Mutant 67.1 
<400> 19 

Asp Ser Asn Lys Phe Asp Phe Arg Lys Arg Gln Gln Asp Gly Leu Phe 
1 5 10 15 

Arg Pro Arg Thr Pro Thr Arg Arg Gln Gly Gly Lys Gly Lys Arg Gly 
20 25 30 

Arg Pro Asp Gly Ile Gly Ala His Pro Asp Ile Glu Gly Asp Arg Pro 
35 40 45 

Arg Gly Asp Gln His Leu Val Arg Phe Leu His Ala Thr Ile Gly Ile 
50 55 60 

His Gly Ser Pro Arg Gly Arg Arg Asn Pro Gln His Ala Ala Ser Arg 
65 70 75 80 

Arg Asn Ala Met Glu His Thr Asp Ser Leu Arg 
85 90 

<210> 20 
<211> 94 
<212> PRT 
<213> Mycobacterium marinum 
<220> 
<223> Mutant 67.1 
<400> 20 

Ile Val Ile Asn Ser Ile Ser Glu Arg Gly Ser Lys Met Asp Tyr Leu 
1 5 10 15 

Gly His Asn Gly His Pro Leu Val Asp Arg Val Glu Arg Glu Asn Val 
20 25 30 

Ala Gly Arg Ile Glu Gly Leu Val Arg Ile Pro Ile Ser Lys Ala Thr 
35 40 45 

Ala His Glu Val Ile Ser Ile Ser Tyr Gly Phe Phe Thr Pro Leu Ser 
50 55 60 

Gly Phe Met Gly Arg Arg Glu Val Asp Glu Thr Leu Asp Asn Met Gln 
65 70 75 80 

Leu Pro Asp Gly Thr Leu Trp Ser Ile Pro Ile Val Phe Asp 
85 90 

<210> 21 
<211> 167 
<212> DNA 
<213> Mycobacterium marinum 
<220> 
<223> Mutant 80.8 
<400> 21 

ccaattagcc gattattcct cgggcgtgct caacgccaag gactacatac caggttactt 60 
ccactaaaat tcgcgggccc cgatcggcga cattactcga cggttttcgg gggaatctca 120 
gcggcgatgg cattcttgag ggcgacgtag cgtttggcgt cgggatc 167 

<210> 22 
<211> 53 
<212> PRT 

<213> Mycobacterium marinum 

<220> 

<223> Mutant 80.9 
<400> 22 

Asp Pro Asp Ala Lys Arg Tyr Val Ala Leu Lys Asn Ala Ile Thr Ala 
1 5 10 15 

Glu Ile Pro Pro Lys Thr Val Glu Cys Arg Arg Ser Gly Pro Ala Asn 
20 25 30 

Phe Ser Gly Ser Asn Leu Ile Cys Ser Pro Trp Arg Ala Arg Pro Arg 
35 40 45 

Asn Asn Gln Leu Ile 
50 

<210> 23 
<211> 144 
<212> DNA 

<213> Mycobacterium marinum 
<220> 
<223> Mutant 39.2 
<400> 23 

gatccgctgg acggcaccaa agaatccatc aagggcagcg atgagttcac cgtcaacatc 60 
gccctggtcg agaaccagga acccattctc ggggcaatct acggtccagc gaagcaactt 120 
ctgcaccacg cggccaaagg ggct 144 



<210> 24 
<211> 46 
<212> PRT 

<213> Mycobacterium marinum 
<220> 
<223> Mutant 39.2 
<400> 24 

Leu Asp Gly Thr Lys Glu Phe Ile Lys Gly Ser Asp Glu Phe Thr Val 
1 5 10 15 

Asn Ile Ala Leu Val Glu Asn Gln Glu Pro Ile Leu Gly Ala Ile Tyr 
20 25 30 

Gly Pro Ala Lys Gln Leu Leu His Tyr Ala Ala Lys Gly Ala 
35 40 45 

<210> 25 
<211> 381 
<212> DNA 
<213> Mycobacterium marinum 
<220> 
<223> Mutant 114.7 
<220> 
<223> "n" represents a, t, c, g, other or unknown 
<400> 25 

agccgtattt cgccattgag agttggggtc ttgagatcgg cactggaagg ggacagcgtg 60 
ctattgcctc ttggtccgcc cttgccacct gatgctgtgg cggctaaacg gggtgagtcg 120 
gggctgctct gcggcttgtc ggttccgctc agctggggta cggccgttcc gccggatgac 180 
tacnaccatt gggcaccgga gcctgaagaa ggcgccgagg ccgtggtcga agaaaacgtg 240 
gatgcggcag ctgccggtac cgacgagtgg gacgagtggg cggaatggag ggagtgggag 300 
gcagcaaatg cccgaacctc attttcgaga tgccccgtac cagcagccgt gatacccgaa 360 
ctcgccggcg gccggttgag a 381 



<210> 26 
<211> 122 
<212> PRT 
<213> Mycobacterium marinum 
<220> 
<223> Mutant 114.7 
<220> 
<223> "Xaa" represents any, other or unknown amino acid 
<400> 26 

Leu Arg Val Gly Val Leu Arg Ser Ala Leu Glu Gly Asp Ser Val Leu 
1 5 10 15 

Leu Pro Leu Gly Pro Pro Leu Pro Pro Asp Ala Val Ala Ala Lys Arg 
20 25 30 

Gly Glu Ser Gly Leu Leu Cys Gly Leu Ser Val Pro Leu Ser Trp Gly 
35 40 45 

Thr Ala Val Pro Pro Asp Asp Tyr Xaa His Trp Ala Pro Glu Pro Glu 
50 55 60 

Glu Gly Ala Glu Ala Val Val Glu Glu Asn Val Asp Ala Ala Ala Ala 
65 70 75 80 

Gly Thr Asp Glu Trp Asp Glu Trp Ala Glu Trp Arg Glu Trp Glu Ala 
85 90 95 

Ala Asn Ala Arg Thr Ser Phe Ser Arg Cys Pro Val Pro Ala Ala Val 
100 105 110 

Ile Pro Glu Leu Ala Gly Gly Arg Leu Arg 
115 120 



<210> 27 
<211> 96 
<212> DNA 

<213> Mycobacterium marinum 
<220> 

<223> Mutant 32.2 
<220> 

<223> "n" represents a, t, c, g, other or unknown 
<400> 27 

tccanncaga ggngcacgta gancgtagga cggaangcgg ngngatcgnc aatacggctg 60 
gcnctgcnag aactgntcga gggcctgcng ctggggcc 90 



<210> 28 
<211> 32 
<212> PRT 
<213> Mycobacterium marinum 
<220> 
<223> Mutant 32.2 
<220> 
<223> "Xaa" represents any, other or unknown amino acid 
<400> 28 

Ala Pro Ala Ala Gly Pro Arg Xaa Val Leu Ala Xaa Pro Ala Val Leu 
1 5 10 15 

Xaa Ile Xaa Pro Xaa Ser Val Leu Arg Ser Thr Cys Xaa Ser Xaa Trp 
20 25 30 



<210> 29 
<211> 62 
<212> DNA 
<213> Mycobacterium marinum 
<220> 
<223> Mutant 42.2 
<220> 
<223> "n" represents a, t, c, g, other or unknown 



<400> 29 

tttgcaatcc acctgtacgc ggaactnttn annnccgttt tgccttgncg aataagctag 60 

<210> 30 
<211> 19 
<212> PRT 
<213> Mycobacterium marinum 
<220> 
<223> Mutant 42.2 
<220> 
<223> "Xaa" represents any, other or unknown amino acid 
<400> 30 

Ser Leu Ile Arg Gln Gly Lys Thr Xaa Xaa Xaa Ser Ser Ala Tyr Arg 
1 5 10 15 

Trp Ile Ala 



<210> 31 
<211> 74 
<212> DNA 

<213> Mycobacterium marinum 
<220> 

<223> Mutant 60.2 
<220> 

<223> "n" represents a, t, c, g, other or unknown 
<400> 31 

ccanacctat ctgtttncag nttnagacna cggnatctca cgcgnttggg cccngccacc 60 
aaacgccgcg tnga 74 

<210> 32 
<211> 24 
<212> PRT 
<213> Mycobacterium marinum 
<220> 
<223> Mutant 60.2 
<220> 
<223> "Xaa" represents any, other or unknown amino acid 
<400> 32 

Xaa Pro Ile Cys Xaa Gln Xaa Xaa Thr Thr Xaa Ser His Ala Xaa Gly 
1 5 10 15 

Pro Xaa His Gln Thr Pro Arg Xaa 
20 



<210> 33 
<211> 24 
<212> PRT 




<213> Mycobacterium marinum 
<220> 
<223> Mutant 60.2 
<220> 
<223> "Xaa" represents any, other or unknown amino acid 
<400> 33 

Xaa Thr Tyr Leu Phe Xaa Xaa Xaa Asp Xaa Gly Ile Ser Arg Xaa Trp 
1 5 10 15 

Ala Xaa Pro Pro Asn Ala Ala Xaa 
20 

<210> 34 
<211> 24 
<212> PRT 

<213> Mycobacterium marinum 
<220> 

<223> Mutant 60.2 
<220> 

<223> "Xaa" represents any, other or unknown amino acid 
<400> 34 

Pro Xaa Leu Ser Val Xaa Xaa Xaa Arg Xaa Arg Xaa Leu Thr Arg Leu 
1 5 10 15 

Gly Pro Ala Thr Lys Arg Arg Val 
20 

<210> 35 
<211> 74 
<212> DNA 

<213> Mycobacterium marinum 
<220> 

<223> Mutant 60.2 
<220> 

<223> "n" represents a, t, c, g, other or unknown 
<400> 35 

tcnacgcggc gtttggtggc ngggcccaan cgcgtgagat nccgtngtct naanctgnaa 60 
acagataggt ntgg 74 



<210> 36 
<211> 24 
<212> PRT 

<213> Mycobacterium marinum 
<220> 
<223> Mutant 60.2 
<220> 
<223> "Xaa" represents any, other or unknown amino acid 
<400> 36 

Ser Thr Arg Arg Leu Val Ala Gly Pro Xaa Arg Val Arg Xaa Arg Xaa 
1 5 10 15 

Leu Xaa Leu Xaa Thr Asp Arg Xaa 
20 

<210> 37 
<211> 23 
<212> PRT 

<213> Mycobacterium marinum 
<220> 

<223> Mutant 60.2 
<220> 
<223> "Xaa" represents any, other or unknown amino acid 
<400> 37 

Xaa Arg Gly Val Trp Trp Xaa Gly Pro Xaa Ala Asp Xaa Val Val Xaa 
1 5 10 15 

Xaa Xaa Lys Gln Ile Gly Xaa 
20 

<210> 38 
<211> 23 
<212> PRT 
<213> Mycobacterium marinum 
<220> 
<223> Mutant 60.2 
<220> 
<223> "Xaa" represents any, other or unknown amino acid 
<400> 38 

Xaa Ala Ala Phe Gly Gly Xaa Ala Gln Xaa Arg Glu Xaa Pro Xaa Ser 
1 5 10 15 

Xaa Xaa Xaa Asn Arg Val Trp 
20 

<210> 39 
<211> 247 
<212> DNA 

<213> Mycobacterium marinum 
<220> 

<223> Mutant 68.6 
<400> 39 

aaatcatcat ctatcgttac ccggggcaag ccaagcacct cagcaaaaat tctgcagagc 60 
atttcctctt gcggagttcg cggcatacgg ccaatcgccg catgatgatc gggcacaggc 120 
agcgctttac gatccaccct cttattcgga gtcaacggca tggtctcaag tcttacgatg 180 
acagacggca ccatatattc ggccagtttc agggaggcgt agcgccgcag ttctgctgta 240 
tctatca 247 



<210> 40 
<211> 81 
<212> PRT 
<213> Mycobacterium marinum 
<220> 
<223> Mutant 68.6 
<400> 40 

Ile Asp Thr Ala Glu Leu Arg Arg Tyr Ala Ser Leu Lys Leu Ala Glu 
1 5 10 15 

Tyr Met Val Pro Ser Val Ile Val Arg Leu Glu Thr Met Pro Leu Thr 
20 25 30 

Pro Asn Lys Lys Val Asp Arg Lys Ala Leu Pro Val Pro Asp His His 
35 40 45 

Ala Ala Ile Gly Arg Met Pro Arg Thr Pro Gln Glu Glu Met Leu Cys 
50 55 60 

Arg Ile Phe Ala Glu Val Leu Gly Leu Pro Arg Val Thr Ile Asp Asp 
65 70 75 80 

Asp 



<210> 41 
<211> 164 
<212> DNA 

<213> Mycobacterium marinum 
<220> 

<223> Mutant 95.3 
<400> 41 

gattagctta ttcctcaagg cacgagcgat tagcttattc ctcaaggcac gagcgactag 60 
cctatrcccc aaggcacgag cttcgcactt gacggtgtag agctcaatag cttattcctc 120 
aaggcacgag ctcgacttcg cacttgacgg tgtagagctc aaag 164 



<210> 42 
<211> 50 
<212> PRT 

<213> Mycobacterium marinum 
<220> 

<223> Mutant 95.3 
<400> 42 

Asp Leu Ile Pro Gln Gly Thr Ser Asp Leu Ile Pro Gln Gly Thr Ser 
1 5 10 15 

Asp Leu Ile Pro Gln Gly Thr Ser Phe Ala Leu Asp Gly Val Glu Leu 
20 25 30 

Asn Ser Leu Phe Leu Lys Ala Arg Ala Arg Leu Arg Thr Arg Cys Arg 
35 40 45 

Ala Gln 
50 



<210> 43 
<211> 138 
<212> DNA 
<213> Mycobacterium marinum 
<220> 
<223> Mutant 39.2 
<220> 
<221> CDS 
<222> (1)..(138) 
<400> 43 

ctg gac ggc acc aaa gaa ttc atc aag ggc agc gat gag ttc acc gtc 48 
Leu Asp Gly Thr Lys Glu Phe Ile Lys Gly Ser Asp Glu Phe Thr Val 
1 5 10 15 

aac atc gcc ctg gtc gag aac cag gaa ccc att ctc ggg gca atc tac 96 
Asn Ile Ala Leu Val Glu Asn Gln Glu Pro Ile Leu Gly Ala Ile Tyr 
20 25 30 

ggt cca gcg aag caa ctt ctg cac tac gcg gcc aaa ggg gct 138 
Gly Pro Ala Lys Gln Leu Leu His Tyr Ala Ala Lys Gly Ala 
35 40 45 



<210> 44 
<211> 366 
<212> DNA 
<213> Mycobacterium marinum 
<220> 
<223> Mutant 114.7 
<220> 
<223> "n" represents a, t, c, g, other or unknown 
<220> 
<221> CDS 
<222> (1)..(366) 
<400> 44 

ttg aga gtt ggg gtc ttg aga tcg gca ctg gaa ggg gac agc gtg cta 48 
Leu Arg Val Gly Val Leu Arg Ser Ala Leu Glu Gly Asp Ser Val Leu 
1 5 10 15 

ttg cct ctt ggt ccg ccc ttg cca cct gat gct gtg gcg gct aaa cgg 96 
Leu Pro Leu Gly Pro Pro Leu Pro Pro Asp Ala Val Ala Ala Lys Arg 
20 25 30 

ggt gag tcg ggg ctg ctc tgc ggc ttg tcg gtt ccg ctc agc tgg ggt 144 
Gly Glu Ser Gly Leu Leu Cys Gly Leu Ser Val Pro Leu Ser Trp Gly 
35 40 45 

acg gcc gtt ccg ccg gat gac tac nac cat tgg gca ccg gag cct gaa 192 
Thr Ala Val Pro Pro Asp Asp Tyr Xaa His Trp Ala Pro Glu Pro Glu 
50 55 60 

gaa ggc gcc gag gcc gtg gtc gaa gaa aac gtg gat gcg gca gct gcc 240 
Glu Gly Ala Glu Ala Val Val Glu Glu Asn Val Asp Ala Ala Ala Ala 
65 70 75 80 

ggt acc gac gag tgg gac gag tgg gcg gaa tgg agg gag tgg gag gca 288 
Gly Thr Asp Glu Trp Asp Glu Trp Ala Glu Trp Arg Glu Trp Glu Ala 
85 90 95 

gca aat gcc cga acc tca ttt tcg aga tgc ccc gta cca gca gcc gtg 336 
Ala Asn Ala Arg Thr Ser Phe Ser Arg Cys Pro Val Pro Ala Ala Val 
100 105 110 

ata ccc gaa ctc gcc ggc ggc cgg ttg aga 366 
Ile Pro Glu Leu Ala Gly Gly Arg Leu Arg 
115 120 



<210> 45 
<211> 12 
<212> DNA 
<213> Artificial Sequence 
<220> 
<223> Description of Artificial Sequence: Synthetic 
      Oligonucleotide 
<400> 45 

gatcgctcgt gc 12 



<210> 46 
<211> 25 
<212> DNA 
<213> Artificial Sequence 
<220> 
<223> Description of Artificial Sequence: Synthetic 
      Oligonucleotide 
<400> 46 

ttattcctca aggcacgagc gatcc 

This International Searching Authority found multiple (groups of) 
inventions in this international application, as follows: 

1. Claims: 1-5, 28-42, 46, 50-76 

Method for identifying a virulence gene of M. marinum, 
avirulent M. marinum, isolated nucleic acid comprising 
oligonucleotide of SEQ ID NO: 4, 6, 8, 11, 13, 21, 23, 25, 
27, 29, 31, 39, 41 and nucleic acids which are complementary 
to or which can hybridize under conditions of high 
stringency to a portion of said nucleic acid identified by 
said SEQ ID NOs, pharmaceutical composition comprising 
avirulent M. marinum bacterium, attenuated M. marinum 
vaccine comprising said avirulent bacterium, method for 
isolating a mutagenized M. marinum bacterium which exhibits 
reduced virulence in a host. 

2. Claims: 6-27, 43-45, 47 

Method for identifying a virulence gene of M. tuberculosis, 
method for generating avirulent M. tuberculosis bacterium, 
avirulent M. tuberculosis comprising one or more mutated 
genes according to claims 9-27, pharmaceutical composition 
comprising avirulent M. tuberculosis bacterium, attenuated 
M. tuberculosis vaccine comprising said avirulent bacterium. 

3. Claim: 48 

An isolated polyketide made by the M. marinum polyketide 
synthase gene. 

4. Claim: 49 

An isolated polyketide made by the M. tuberculosis 
polyketide synthase gene. 